DETECTING SQUAREFREE NUMBERS 

ANDREW R. BOOKER, GHAITH A. HIARY, AND JON P. KEATING 

Abstract. We present an algorithm, based on the explicit formula for L-functions and 
conditional on GRH, for proving that a given integer is squarefree with little or no knowledge 
of its factorization. We analyze the algorithm both theoretically and practically, and use it 
to prove that RSA-210 is not squarefull.

1. Introduction

Let $k$ be a finite field and $f$ a non-zero element of $k[x]$. Then it is well known that $f$ is
squarefree if and only if $\gcd(f, f') = 1$, and the latter condition may be checked quickly (in
deterministic polynomial time) by the Euclidean algorithm. It is a long-standing question
in algorithmic number theory whether there is a correspondingly simple procedure to test if
a given integer is squarefree; in particular, can one determine whether $N \in \mathbb{Z}$ is squarefree
more rapidly than by factoring it?
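As an illustration of the polynomial criterion, the following is a minimal sketch of the test $\gcd(f, f') = 1$ over a prime field $\mathbb{F}_p$. The coefficient-list representation (lowest degree first) and all function names are choices made for this example only.

```python
def trim(f):
    """Drop trailing zero coefficients (lists store the lowest degree first)."""
    while f and f[-1] == 0:
        f = f[:-1]
    return f

def poly_mod(a, b, p):
    """Remainder of a divided by b in F_p[x]; b must be non-zero."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    inv_lead = pow(b[-1], -1, p)  # inverse of the leading coefficient (Python 3.8+)
    while len(a) >= len(b):
        factor = a[-1] * inv_lead % p
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % p
        a = trim(a)  # the leading term has been cancelled
    return a

def poly_gcd(a, b, p):
    """Euclidean algorithm in F_p[x]; the result is defined up to a unit."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        a, b = b, poly_mod(a, b, p)
    return a

def derivative(f, p):
    """Formal derivative of f in F_p[x]."""
    return trim([(i * c) % p for i, c in enumerate(f)][1:])

def is_squarefree_poly(f, p):
    """True if the non-zero polynomial f is squarefree in F_p[x]."""
    return len(poly_gcd(f, derivative(f, p), p)) == 1
```

For example, over $\mathbb{F}_5$ the polynomial $(x+1)^2(x+2) = x^3 + 4x^2 + 2$ is rejected, while $(x+1)(x+2)$ is accepted. No comparably simple test is known for integers, which is the question the paper addresses.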

In this paper, we describe an algorithm, conditional on the Generalized Riemann Hypoth- 
esis (GRH), for proving an integer squarefree with little or no knowledge of its factorization, 
and analyze the complexity of the algorithm both theoretically and practically. In particular, 
we present some heuristic evidence based on random matrix theory that our algorithm runs 
in deterministic subexponential time $O(\exp[(\log N)^{2/3+o(1)}])$. Although this is poorer than
the performance expected of the current best known factoring algorithms, our method is
able to give partial results that one does not obtain from a failed attempt at factoring. In
particular, we show the following (see §3.2).






Theorem 1.1. Assume GRH for quadratic Dirichlet L-functions. Then the RSA challenge 
number RSA-210 is not squarefull, i.e. it has at least one prime factor of multiplicity 1. 

The challenge number RSA-210, with 210 digits, is significant because it is the smallest
that has yet to be factored. Certainly the technology to factor it exists (in fact the slightly
longer RSA-704 and RSA-768 were successfully factored in 2012 and 2009, respectively), but
it remains prohibitively expensive to perform such factorizations routinely. In contrast, the
proof of Theorem 1.1 could be carried out with a desktop PC in a few months. To our
knowledge, Theorem 1.1 is the first statement of its kind to be proven without exhibiting
any factors of the number in question.

1.1. Background. We begin with some background on the problem of squarefree testing,
before describing our main algorithm in §2. Given an integer $N > 1$, we first note that if
$N$ has no prime factors $\le \sqrt[3]{N}$ then it is squarefree if and only if it is not a perfect square.
Thus, since it is easy to detect squares, in order to prove a number squarefree it suffices to
find all of its prime factors up to the cube root. On the other hand, the Pollard-Strassen
algorithm [19, 22] finds all prime factors of $N$ up to a given bound $B$ in time $O_\varepsilon(N^\varepsilon \sqrt{B})$.
This immediately yields an algorithm for squarefree testing in time $O_\varepsilon(N^{1/6+\varepsilon})$. We remark
that with some modifications to the Pollard-Strassen algorithm, along the lines of [5] but
specific to this problem, one can improve the running time very slightly to $O(N^{1/6 - c/\log\log N})$
for some $c > 0$.
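The cube-root observation can be sketched with plain trial division in place of Pollard-Strassen. This is only an illustration of the reduction, not of the faster algorithm; the function name is hypothetical.

```python
from math import isqrt

def is_squarefree_naive(n):
    """Squarefree test for n >= 1 by trial division up to the cube root.

    After all prime factors p with p**3 <= m have been stripped from the
    cofactor m, every remaining prime factor exceeds the cube root of m,
    so m is 1, a prime, a product of two distinct primes, or the square
    of a prime. Only the last case needs to be excluded.
    """
    m, p = n, 2
    while p * p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:      # p**2 divides n
                return False
        p += 1                  # composite p never divides the stripped cofactor
    r = isqrt(m)
    return m == 1 or r * r != m
```

The loop runs for about $N^{1/3}$ steps in the worst case, which is the cost that Pollard-Strassen lowers to roughly $N^{1/6}$.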

Although Pollard-Strassen is often regarded as a purely theoretical result, with modern 
computers it is possible to implement it and realize some improvement in speed over trial 
division. However, the gains do not occur until $B$ is of size $10^9$ at least. As a result, even
the modified algorithm mentioned above is only practical for $N$ up to $10^{70}$ or so. On the
other hand, the Quadratic Sieve algorithm running on a PC will, in practice, almost surely
factor a given $N < 10^{70}$ within a few minutes; thus, at least with present algorithms and
technology, it is always better to try to factor the given integer. 

1.2. Fundamental discriminants. Our approach rests on a way of characterizing the
squarefree integers that does not directly refer to their factorization. Precisely, if $d \in \mathbb{Z}$,
$d \equiv 1 \pmod 4$, then $d$ is squarefree if and only if it is a fundamental discriminant. (Note
that if $N \in \mathbb{Z}$ is odd then $d = (-1)^{(N-1)/2} N$ satisfies $d \equiv 1 \pmod 4$, so this restriction entails
no loss of generality.) The advantage of this criterion is that whether or not a given discriminant
$d$ is fundamental can be detected from values of the quadratic character $\chi_d(n) = \left(\frac{d}{n}\right)$,
where $\left(\frac{\cdot}{\cdot}\right)$ denotes the Kronecker symbol. In turn, $\chi_d(n)$ is easy to compute for a given $n$,
thanks to quadratic reciprocity; in particular, if $n$ is a prime then the Kronecker symbol $\left(\frac{d}{n}\right)$
reduces to the Legendre symbol, which can be evaluated, e.g., by Euler's criterion.
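To make the remark about quadratic reciprocity concrete, here is a minimal sketch of the Kronecker symbol for a positive lower argument, using the standard binary Jacobi-symbol algorithm. The function names are illustrative and the code is not taken from the paper.

```python
def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and an integer n >= 1."""
    result = 1
    twos = 0
    while n % 2 == 0:               # split off the power of 2 in n
        n //= 2
        twos += 1
    if twos:
        if a % 2 == 0:
            return 0
        if twos % 2 == 1 and a % 8 in (3, 5):
            result = -result        # (a/2) = -1 when a = +-3 mod 8
    a %= n                          # n is now odd: Jacobi symbol from here on
    while a:
        while a % 2 == 0:           # supplementary law for (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                 # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def euler_criterion(a, p):
    """Legendre symbol (a/p) for an odd prime p via a**((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r
```

The number of loop iterations is logarithmic in the arguments, so each character value costs about as much as a gcd computation.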

Let $\mathcal{F}$ denote the set of fundamental discriminants. To see how one might use the above
to prove quickly that a given $d$ is squarefree, note first that we have in general that $d = \Delta\ell^2$,
where $\Delta \in \mathcal{F}$ and $\ell \in \mathbb{Z}_{>0}$. Here $|\Delta|$ is an invariant of the character $\chi_d$ (its conductor),
which we aim to show equals, or is at least close to, $|d|$. By testing whether $d$ is a square,
we may assume without loss of generality that $\Delta \ne 1$.

For any x > 0, consider the series 

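$$S_\Delta(x) = \frac{1}{\sqrt{x}} \sum_{n=1}^{\infty} \chi_\Delta(n) \left(\frac{n}{x}\right)^{\frac{1-\chi_\Delta(-1)}{2}} \exp\left(-\pi\left(\frac{n}{x}\right)^2\right), \tag{1}$$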

which is essentially the twisted $\theta$-function. Note here that we may calculate $\chi_\Delta(n)$ for any
given $n$, even without knowledge of $\Delta$; in fact, we have $\chi_\Delta(n) = \chi_d(n)$ unless $n$ has a common
factor with $\ell$. We may assume without loss of generality that we never come across such an
$n$, since otherwise we will have found a square factor of $d$, answering the original question.

If one thinks of the character values $\chi_\Delta(n)$ as "random" $\pm 1$ then, thanks to the decay of
the Gaussian, the series in (1) is the result of a random walk of length about $x$, which will
typically have size on the order of $\sqrt{x}$; thus, one might expect $S_\Delta(x)$ to oscillate, without
growing very large or decaying, as $x \to \infty$. This turns out to be an accurate description for
$x$ up to $\sqrt{|\Delta|}$, but for larger $x$, $S_\Delta(x)$ is constrained by the symmetry

$$S_\Delta(x) = S_\Delta(|\Delta|/x), \tag{2}$$

following from the Poisson summation formula. 
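The symmetry can be checked numerically for small conductors. The sketch below assumes the normalization $S_\Delta(x) = x^{-1/2}\sum_{n\ge1}\chi_\Delta(n)(n/x)^{a}e^{-\pi(n/x)^2}$ with $a = (1-\chi_\Delta(-1))/2$, which is the standard form for which the functional equation of the $\theta$-function gives (2). The two characters are hard-coded tables for $\Delta = 5$ and $\Delta = -3$, chosen only for this example.

```python
import math

# Values of chi_Delta(n) indexed by n mod |Delta| (hard-coded for illustration).
CHI = {5: [0, 1, -1, -1, 1], -3: [0, 1, -1]}

def S(delta, x):
    """Truncated theta-type series S_Delta(x) for delta in CHI and x > 0."""
    chi = CHI[delta]
    q = abs(delta)
    a = 0 if delta > 0 else 1           # parity exponent (1 - chi(-1)) / 2
    n_max = int(8 * x) + 10             # the Gaussian tail beyond this is negligible
    total = 0.0
    for n in range(1, n_max + 1):
        y = n / x
        total += chi[n % q] * y ** a * math.exp(-math.pi * y * y)
    return total / math.sqrt(x)

def symmetry_defect(delta, x):
    """|S(x) - S(|Delta|/x)|, which should vanish up to rounding."""
    return abs(S(delta, x) - S(delta, abs(delta) / x))
```

If the character were imprimitive, that is if the true conductor were smaller than the modulus used in the reflection, the defect would not vanish; this is the sense in which the point of symmetry reveals $|\Delta|$.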

The point of symmetry of (2) gives an indication of $|\Delta|$, and thus we can rule out small
values of $|\Delta|$ essentially by drawing the graph of $S_\Delta(x)$. More precisely, for any given $B > 0$,
one can decide whether or not $|\Delta| < B$ in time $O_\varepsilon(N^\varepsilon \sqrt{B})$, which matches the running time
of Pollard-Strassen for the same task. Moreover, if one could find a method of computing
the $\theta$-function $S_\Delta(x)$ substantially more quickly than by direct in-order summation, say in
time $O_\varepsilon(N^\varepsilon x^{1-\delta})$ for some $\delta \in (0,1)$, then this improves to $O_\varepsilon(N^\varepsilon B^{(1-\delta)/2})$; in particular,
taking $B = N^{1/(3-2\delta)}$ and falling back on Pollard-Strassen to rule out $\ell \le \sqrt{N/B}$, we would
get an algorithm to certify $N$ squarefree in time $O_\varepsilon\big(N^{\frac{1-\delta}{2(3-2\delta)}+\varepsilon}\big)$.
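The choice of $B$ comes from balancing the two costs. As a worked check (a sketch that suppresses the $N^\varepsilon$ factors and all constants), equate the cost of locating the symmetry point with the Pollard-Strassen cost of ruling out $\ell \le \sqrt{N/B}$:

```latex
\begin{align*}
B^{(1-\delta)/2} = (N/B)^{1/4}
  &\iff B^{(3-2\delta)/4} = N^{1/4}
   \iff B = N^{1/(3-2\delta)}, \\
\text{total cost} \approx B^{(1-\delta)/2}
  &= N^{\frac{1-\delta}{2(3-2\delta)}},
  \qquad
  \frac{1-\delta}{2(3-2\delta)} = \frac{1}{6} - \frac{\delta}{6(3-2\delta)} < \frac{1}{6}.
\end{align*}
```

For $\delta = 0$ this recovers the exponent $1/6$ of Pollard-Strassen, and any $\delta > 0$ gives a strict improvement.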

2. The explicit formula 

Our main interest, however, is in algorithms that work in subexponential time. This is 
difficult to attain in the above approach because we used sums over integers n. It is well- 
understood in problems of this type that one can do better by considering sums over primes, 
at the expense of having to assume GRH. 

To be precise, let $L(s,\chi_\Delta) = \sum_{n=1}^{\infty} \chi_\Delta(n) n^{-s}$ be the Dirichlet L-function corresponding
to $\Delta \ne 1$. Assuming GRH, the non-trivial zeros of $L(s,\chi_\Delta)$ may be written as $\frac12 \pm i\gamma_j(\Delta)$,
$j = 1, 2, 3, \ldots$, where $0 \le \gamma_1(\Delta) \le \gamma_2(\Delta) \le \cdots$, and each ordinate is repeated with the
appropriate multiplicity. Further, let $g : [0,\infty) \to \mathbb{C}$ be a test function which is continuous
of compact support, piecewise smooth, and has cosine transform $h(t) = 2\int_0^\infty g(x)\cos(tx)\,dx$.
Then the "explicit formula" for $L(s,\chi_\Delta)$ reads

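$$\begin{aligned} g(0)\log|\Delta| ={}& 2\sum_{j=1}^{\infty} h(\gamma_j(\Delta)) + 2\sum_{n=1}^{\infty} \frac{\Lambda(n)\chi_\Delta(n)}{\sqrt{n}}\, g(\log n) \\ &+ g(0)\log(8\pi e^{\gamma}) - \int_0^\infty \frac{g(0)-g(x)}{2\sinh(x/2)}\,dx + \chi_\Delta(-1)\int_0^\infty \frac{g(x)}{2\cosh(x/2)}\,dx, \end{aligned} \tag{3}$$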

where $\Lambda$ is the von Mangoldt function and $\gamma$ (without subscript) is Euler's constant.

If not for the sum over zeros $\gamma_j(\Delta)$, this would be exactly what we seek, i.e. a formula for
the conductor $|\Delta|$ in terms of character values. Without knowledge of the zeros, we do not
get such an exact identity, but we can at least get an inequality in one direction if the test
function is chosen so that $h$ is non-negative, i.e.

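$$\log|\Delta| \ge 2\sum_{n=1}^{\infty} \frac{\Lambda(n)\chi_\Delta(n)}{\sqrt{n}}\, g(\log n) + \log(8\pi e^{\gamma}) - \int_0^\infty \frac{1-g(x)}{2\sinh(x/2)}\,dx + \chi_\Delta(-1)\int_0^\infty \frac{g(x)}{2\cosh(x/2)}\,dx \tag{4}$$

for any $g : [0,\infty) \to \mathbb{R}$ which is continuous of compact support, satisfies $g(0) = 1$, and has
non-negative cosine transform.

Footnote (on the running time for deciding whether $|\Delta| < B$ in §1.2): We omit the proof of this, but the main point is the fact that $S_\Delta(e^x)$ is the Fourier transform of the
complete L-function $\Lambda(\frac12 + it, \chi_\Delta)$, so it is essentially band-limited.

Footnote (on the multiplicity of the ordinates $\gamma_j(\Delta)$): If $L(s,\chi_\Delta)$ has a zero at $s = \frac12$ of multiplicity $m$, then $m$ is necessarily even, and we take $\frac{m}{2}$ copies of
this zero, i.e. $\gamma_j(\Delta) = 0$ for $j \le \frac{m}{2}$ and $\gamma_{m/2+1}(\Delta) > 0$.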

In the next few subsections we explore some strategies for exploiting (4) to prove that our
given $d$ is squarefree. The proofs of Propositions 2.1-2.3 below are given in the appendix.

2.1. Varying the test function. Our first, and simplest, strategy is to search for a test
function such that the right-hand side of (4) is close to $\log|\Delta|$. Naturally, we pay a price for
ignoring the zero sum $Z = \sum_{j=1}^{\infty} h(\gamma_j(\Delta))$, in that our estimate for $|\Delta|$ is a factor of $e^{2Z}$ too
small. We can still use this to prove that $d$ is squarefree by ruling out values of $\ell \le e^{Z}$ using
Pollard-Strassen or otherwise, but this takes exponential time $\gg e^{Z/2}$ in the size of $Z$.

On the other hand, note that the sum over prime powers in (4) is exponentially long, i.e.
if $g$ has support $[0, X]$ then we need to compute the right-hand side of (4) for $n$ up to $e^X$.
Thus, we would like $X$ not to be very large. However, if we choose $X$ too small then, by the
uncertainty principle, the cosine transform $h$ will be relatively "wide", so that the zero sum
will typically be large.

Our first result shows that there is an optimal choice of test function for each fixed X, 
and thus an optimal tradeoff between these two exponential penalties. 

Proposition 2.1. Let $C(X)$ be the class of functions $g : [0,\infty) \to \mathbb{R}$ that are continuous,
supported on $[0, X]$, have non-negative cosine transform, and satisfy $g(0) = 1$. For $g \in C(X)$,
let $l(g)$ denote the right-hand side of (4). Then for every $X > 0$ there exists $g_X \in C(X)$
such that $l(g_X) \ge l(g)$ for all $g \in C(X)$.

We remark further that if $g \in C(X)$ then its cosine transform $h$ is band-limited, and so,
by Jensen's formula, $h$ has at most $O_X(T)$ zeros in the interval $[-T, T]$ for large $T$. Since it
is known that $L(s,\chi_\Delta)$ has $\gg T\log T$ distinct zeros with imaginary part in $[-T, T]$, under
GRH the zero sum in (3) cannot vanish, so that (4) is a strict inequality for any fixed $X$, i.e.
$\log|\Delta| > l(g_X)$. However, it is easy to see that $l(g_X)$ tends continuously and monotonically
to $\log|\Delta|$ as $X \to \infty$.

Although Prop. 2.1 is an existence result only, one can try to solve for the optimal test
function $g_X$ by approximating $C(X)$ using a sufficiently rich, finite-dimensional space of
functions. For instance, let $M$ be a non-negative integer, and consider step functions $f$ of
the form

$$f(x) = \sum_{n=-M}^{M} a_n \mathbf{1}_{(-1/2,1/2)}\!\left(\frac{2M+1}{X}\,x - n\right), \tag{5}$$

where the coefficients $a_n$ are real and satisfy $a_{-n} = a_n$. If we set $g = f * f$, then the right-hand
side of (4) is a quadratic form in the $a_n$, and the condition $g(0) = 1$ amounts to an
$L^2$-normalization. Thus, we can find the optimal lower bound for this family of test functions
by computing the matrix of the form and finding its largest eigenvalue. It is not hard to
see that this family comes arbitrarily close to the optimal $g_X$ as $M \to \infty$, although it may
be the case that $g_X$ is highly oscillatory, meaning that we would need to take $M$ very large
before finding a close approximation to it.

Footnote (on the conditions imposed on $g$ in (4)): A function $g$ satisfying these conditions need not be piecewise smooth, and in fact $\int_0^\infty \frac{1-g(x)}{2\sinh(x/2)}\,dx$ may
be divergent. However, since $g$ has non-negative cosine transform, $\frac{1-g(x)}{2\sinh(x/2)}$ is non-negative, so we may
interpret the right-hand side of (4) as $-\infty$ whenever the integral diverges. With that convention, a standard
approximation argument shows that (4) holds for all $g$ as indicated.

Footnote (on the largest eigenvalue): If $A$ is the matrix associated with the quadratic form and $c := (2M+1)/X$, then $\max_{|a|^2 = c} a^{T} A a = c\lambda_1$,
where $\lambda_1$ is the largest eigenvalue of $A$, $a := (a_{-M}, \ldots, a_M)$, and the condition $|a|^2 = c$ is equivalent to
$g(0) = 1$.
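To see what the step-function family looks like in practice, the following sketch builds $g = f * f$ from the coefficients $a_{-M}, \ldots, a_M$. It relies on two facts: for symmetric coefficients the convolution equals the autocorrelation, and the convolution of equal-width steps is piecewise linear with breakpoints at multiples of $w = X/(2M+1)$. The function name is hypothetical, and the eigenvalue optimization itself is not attempted here.

```python
import math

def g_from_steps(a, X):
    """Return g = f * f for the step function f with symmetric coefficients a.

    `a` lists a_{-M}, ..., a_M (length 2M + 1) and f is supported on
    (-X/2, X/2), so g is supported on [-X, X].
    """
    size = len(a)                   # 2M + 1
    w = X / size                    # width of each step
    # Node values g(k * w) = w * sum_i a_i a_{i+k}, for k = 0, ..., 2M + 1.
    nodes = [w * sum(a[i] * a[i + k] for i in range(size - k))
             for k in range(size + 1)]

    def g(x):
        t = abs(x) / w
        k = int(t)
        if k >= size:
            return 0.0
        frac = t - k                # linear interpolation between nodes
        return (1 - frac) * nodes[k] + frac * nodes[k + 1]

    return g
```

With $M = 0$ and $a_0 = 1/\sqrt{X}$ this returns the triangle function $\max(0, 1 - |x|/X)$, and in general $g(0) = |a|^2 X/(2M+1)$, which is the $L^2$-normalization mentioned in the text.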

2.2. Twisting. A second strategy, which performs well in practice, is to "twist" our given
quadratic character $\chi_d$ by other characters $\chi_q$, and look for a $q$ for which the lower bound in
(4) is favorable. This is related to the first strategy since, by Fourier analysis, varying the
test function amounts to considering combinations of the twists by $n^{it}$ for various $t$. Twists
by quadratic characters have the added advantage of zero repulsion around the central point,
as we explain in detail in §3.1. In other words, if we run out of luck with our given value
of $d$ then we can multiply it by $q \in \mathcal{F}$ relatively prime to $d$ and ask if the product is a
fundamental discriminant. This operation also introduces a penalty, since (4) becomes a
lower bound for $\log|q\Delta|$, so we have to subtract $\log|q|$:

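$$\log|\Delta| \ge -\log|q| + 2\sum_{n=1}^{\infty} \frac{\Lambda(n)\chi_{q\Delta}(n)}{\sqrt{n}}\, g(\log n) + \log(8\pi e^{\gamma}) - \int_0^\infty \frac{1-g(x)}{2\sinh(x/2)}\,dx + \chi_{q\Delta}(-1)\int_0^\infty \frac{g(x)}{2\cosh(x/2)}\,dx. \tag{6}$$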

What we gain by this strategy is the hope of finding a twist $\chi_{q\Delta}$ such that the low-lying
zeros of $L(s,\chi_{q\Delta})$ are unusually sparse, so that the zero sum $\sum_{j=1}^{\infty} h(\gamma_j(q\Delta))$ can be made
small even with a relatively simple choice of $g$. For instance, we might hope that $L(s,\chi_{q\Delta})$
has a large zero gap around the central point. In that case, we have the following.

Proposition 2.2. Suppose that $L(s,\chi_{q\Delta})$ satisfies GRH and has no non-trivial zeros with
imaginary part in $(-\delta, \delta)$. Set $X = 2\delta^{-1}(A + \log\log|q\Delta|)$ for some $A > 0$. Then there is an
explicit $g \in C(X)$ whose cosine transform $h$ satisfies

$$\sum_{j=1}^{\infty} h(\gamma_j(q\Delta)) \ll \frac{e^{-A} X}{(\log\log|q\Delta|)^{3/2}}$$

with an absolute and effective implied constant.

In other words, there is a test function $g$ with support of size inversely proportional to the
size of the zero gap for which the zero sum is relatively small. Thus, ruling out small values
of $\ell$ to complete the proof that $d$ is squarefree is fast compared to evaluating the explicit
formula.

Although it is difficult to ascertain directly for a given $q$ whether $L(s,\chi_{q\Delta})$ has a large
zero gap, we can simply try computing the lower bound (6) using the test function given
by Prop. 2.2 for a particular desired value of $\delta$. We may repeat this procedure for many $q$
until we find one which is good enough, and then use the quadratic form approach with a
relatively small matrix to refine the choice of test function.

The crucial question is thus how large a zero gap one can expect to find by searching
through various $q$. On average, one expects the first zero gap around the central point to
be about $2\pi/\log|q\Delta|$, which is of no use in Prop. 2.2. On the other hand, if we found $q$ of
modest size for which the first zero gap was on the order of $1/\sqrt{\log|q\Delta|}$, say, then we would
have a fast algorithm for proving that $d$ is squarefree.

Footnote (on $t$-twists and $q$-twists): We will sometimes write $t$-twists and $q$-twists to mean twisting by $n^{it}$ and $\chi_q$.

Footnote (on ruling out small values of $\ell$): In fact, as a by-product of evaluating the explicit formula, we will test $d$ for divisibility by all primes
$p \le e^X$. Thus, if $\sum h(\gamma_j(q\Delta)) \le X$ then no additional work is necessary to prove that $d$ is squarefree.

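To make this more precise, anticipating a subexponential running time on the order of
$\exp((\log|\Delta|)^{\theta})$, for $\theta > 0$ we define

$$M_\Delta(\theta) = \max\big\{\gamma_1(q\Delta) : q \in \mathcal{F},\ (q,\Delta) = 1,\ |q| \le \exp((\log|\Delta|)^{\theta})\big\},$$

$$\eta_\Delta(\theta) = -\frac{\log M_\Delta(\theta)}{\log\log|\Delta|}, \qquad \eta_\infty(\theta) = \limsup_{\substack{\Delta \in \mathcal{F} \\ |\Delta| \to \infty}} \eta_\Delta(\theta), \qquad \theta^* = \inf\{\theta > 0 : \eta_\infty(\theta) \le \theta\}.$$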

Thus, if we search through all $q \in \mathcal{F}$ with $|q| \le \exp((\log|\Delta|)^{\theta^*})$, we expect to find at least
one for which the first zero gap of $L(s,\chi_{q\Delta})$ has size on the order of $(\log|\Delta|)^{-\theta^*}$. Combining
this with Prop. 2.2 and a brute-force search strategy, we obtain the following.



Proposition 2.3. Assume GRH for quadratic Dirichlet L-functions. There is an algorithm 
that takes as input a positive integer N and outputs either a non-trivial square factor of 
N or a proof that N is squarefree. If N is squarefree then the algorithm runs in time 
$O(\exp[(\log N)^{\theta^*+o(1)}])$.

Note that the assumption of GRH in the proposition applies to the certificates generated 
by the algorithm as well as its running time analysis. 

2.3. Examples. In Figure 1 we give a basic illustration of the favorable situation of a large
gap around the central point. We chose $L(s,\chi_d)$, where $d = 1548889$ is a fundamental
discriminant, because it has a gap size $\approx 1.747424$ there, which is about 4 times the average
$2\pi/\log d \approx 0.44$. Therefore, we expect the lower bound (4) to be quite good even with a
simple choice of $g$. For instance, if $M = 0$ and $a_0 = 1/\sqrt{X}$ in (5), we obtain $g(x) = \max(0, 1 - |x|/X)$
and $h(t) = X\sin^2(Xt/2)/(Xt/2)^2$. Choosing $X = 7/2$, we have $2\sum_{j \ge 1} h(\gamma_j(d)) \approx 6.73$,
and so the lower bound (4) is $\log d - 6.73 \approx 7.5$. This suffices to prove that $d$ is
squarefree, since in computing the prime sum of the explicit formula we will have checked
that $d$ has no repeated prime factor $\le e^{7/2}$ (note that $d = 23 \cdot 67343$, so $\chi_d(23) = 0$), and so
certainly none $\le e^{6.73/2}$. In particular, (4) allows one
to certify that $d$ is squarefree from the primes $\le e^{7/2} \approx 33$ only, which is already better
than trial division. Thus, our strategy can lead to a gain even for small $d$.
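The numbers in this example can be reproduced with a short computation. The sketch below evaluates the right-hand side of (4) for the triangle test function $g(x) = \max(0, 1 - |x|/X)$, using a simple sieve for the prime powers and the midpoint rule for the two integrals (the tail $\int_X^\infty dx/(2\sinh(x/2)) = -\log\tanh(X/4)$ is taken in closed form). All function names are illustrative, and $\chi_d(-1)$ is taken to be the sign of $d$, which assumes $d$ is a fundamental discriminant.

```python
import math

def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and an integer n >= 1."""
    result, twos = 1, 0
    while n % 2 == 0:
        n //= 2
        twos += 1
    if twos:
        if a % 2 == 0:
            return 0
        if twos % 2 == 1 and a % 8 in (3, 5):
            result = -result
    a %= n
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def prime_powers(limit):
    """Yield (n, log p) for every prime power n = p**k <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    for p in range(2, limit + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = 0
            n = p
            while n <= limit:
                yield n, math.log(p)
                n *= p

def midpoint(func, a, b, steps=20000):
    """Midpoint rule; avoids evaluating the integrand at the endpoints."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def lower_bound(d, X):
    """Right-hand side of (4) for the triangle test function of support X."""
    g = lambda x: max(0.0, 1.0 - abs(x) / X)
    prime_sum = sum(lam * kronecker(d, n) / math.sqrt(n) * g(math.log(n))
                    for n, lam in prime_powers(int(math.exp(X))))
    euler_gamma = 0.5772156649015329
    const = math.log(8 * math.pi) + euler_gamma
    i_sinh = midpoint(lambda x: (1 - g(x)) / (2 * math.sinh(x / 2)), 0.0, X)
    i_sinh += -math.log(math.tanh(X / 4))       # tail beyond X, where g = 0
    i_cosh = midpoint(lambda x: g(x) / (2 * math.cosh(x / 2)), 0.0, X)
    sign = 1 if d > 0 else -1                   # chi_d(-1), assuming d fundamental
    return 2 * prime_sum + const - i_sinh + sign * i_cosh
```

Calling `lower_bound(1548889, 3.5)` should give a value close to the 7.5 quoted above, obtained from the primes up to 33 only.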

The behavior of the lower bound as $X$ increases is worth noting. We illustrate it for
$L(s,\chi_d)$, Figure 2 (left plot), using the same simple choice of $g$ as before. The overall shape
of the plot is typical for the case of a large gap, as $L(s,\chi_d)$ has. It suggests that, in the
case of a large gap, there will be an initial (good) region where the lower bound increases
steeply, followed by an inevitable, unless $L(1/2,\chi_d) = 0$, region of small oscillations. If the
gap about the center is not particularly large, however, then the initial good region will be
much smaller. This is illustrated in Figure 2 (right plot) using the L-function of a randomly
chosen fundamental discriminant, $L(s,\chi_{2000005})$, which has an average size gap of $\approx 0.515984$
about the center. Notice that there is a wide good region later on in the plot, but it comes
in too late to be useful in our algorithm.



Footnote (on the size of the first zero gap needed in §2.2): If there is a constant $\varepsilon > 0$ such that one can always find a $\gamma_1(q\Delta) > (\log\log|\Delta|)^{1+\varepsilon}/\log|\Delta|$, then
Prop. 2.2 already allows one to certify that an integer is squarefree in subexponential time (on the GRH).
It would be interesting to see if one could push this line of thought to an improvement of the $O(N^{1/6+o(1)})$
time bound of Pollard-Strassen, but we do not do so here.

FIGURE 1. Large zero gap around the central point of $L(s,\chi_{1548889})$, together with the test
function on the zero side $h(t) = 8\sin^2(7t/4)/(7t^2)$ resulting from $M = 0$ and $X = 7/2$.

FIGURE 2. Behavior of the lower bound (4) as $X$ increases: the case of a large gap (left)
compared with the case of an average gap (right).



3. Complexity 

By computing the 1-level density of the family of twists by $\chi_q$, $q \in \mathcal{F}$, one can see
that $\eta_\infty(\theta) \le 1$ for $\theta \ge 1$, so that $\theta^* \in [0,1]$. However, the algorithm of Prop. 2.3 is
subexponential only if $\theta^* < 1$, which unfortunately seems beyond the current technology to
prove, even under GRH. We can, however, make a reasonable conjecture of the value of $\theta^*$ by
answering the analogous question for a suitable random matrix model, where the calculation
is more tractable:



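Conjecture 3.1. We have

$$\eta_\infty(\theta) = \begin{cases} 1 - \theta/2 & \text{if } 0 < \theta \le 1, \\ 1/2 & \text{if } \theta > 1. \end{cases}$$

In particular, $\theta^* = \frac{2}{3}$.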



Thus, by Prop. 2.3, we conjecture that our algorithm is capable of certifying $N$ squarefree
in time $O(\exp[(\log N)^{2/3+o(1)}])$.

We give a detailed justification for the conjecture in §3.1 below. First, however, it turns
out that one can arrive at the same conclusion for the running time without any consideration
of the zero sum in (3), by analyzing the lower bound (6) using a simple model of the $\chi_{q\Delta}(p)$
as independent random variables assuming the values 1 and $-1$ with equal probability. We
make this more precise in the following proposition, which is a consequence of [15, Theorem 1].

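Proposition 3.2. Let $Y_1, Y_2, \ldots$ be independent random variables such that $P(Y_j = 1) =
P(Y_j = -1) = \frac12$, and put $Y := 2\sum_{p_j \le e^X} Y_j \frac{\log p_j}{\sqrt{p_j}}\left(1 - \frac{\log p_j}{X}\right)$, where $p_j$ denotes the $j$th prime
number. Then, for each $n$ satisfying $3 \le n \le e^X$, we have

$$P(Y \ge v_n) \ge 2^{-22}\exp\left(-\frac{30\,v_n^2}{c_n}\right), \qquad P(Y \ge u_n) \le \exp\left(-\frac{u_n^2}{32\,c_n}\right),$$

where $v_n := \sum_{p_j \le n} \frac{\log p_j}{\sqrt{p_j}}\left(1 - \frac{\log p_j}{X}\right)$, $u_n := 4v_n$, and $c_n := \sum_{n < p_j \le e^X} \frac{\log^2 p_j}{p_j}\left(1 - \frac{\log p_j}{X}\right)^2$.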

In particular, as $n, X \to \infty$ with $n = e^{o(\sqrt{X})}$, so that $v_n \sim 2\sqrt{n}$ and $c_n \sim \frac{1}{12}X^2$, we get
$P(Y \ge 2\sqrt{n}) \ge \exp(-(1440 + o(1))n/X^2)$. Therefore, after $\lfloor \exp X \rfloor$ independent samples
of $Y$, we expect to encounter $Y \ge \frac{1}{\sqrt{360}}X^{3/2}$ at least once. In the opposite direction, we
have $P(Y \ge 8\sqrt{n}) \le \exp(-(24 + o(1))n/X^2)$, and so after $\lfloor \exp X \rfloor$ independent samples,
we expect at most one instance of $Y \ge \frac{4}{\sqrt{6}}X^{3/2}$. Together, these estimates are consistent
with $\theta^* = 2/3$. Of course, Prop. 3.2 simplifies the situation by ignoring the higher prime
powers, but that is not important since they contribute only $O(X)$, and so do not impact
the $X^{3/2}$ term. It is worth noting, however, that the contribution of the higher prime powers
in numerical computations is still noticeable because $\chi_q(p^2) = 1$ whenever $(q,p) = 1$, and
so the bulk of their contribution is guaranteed to help our lower bound, regardless of the
number of samples.
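The random model is easy to simulate for small $X$. The sketch below draws samples of $Y = 2\sum_{p \le e^X} Y_p \frac{\log p}{\sqrt{p}}(1 - \frac{\log p}{X})$ with independent signs $Y_p = \pm 1$ and compares the empirical variance with the exact value $4\sum_p w_p^2$. It only illustrates the model; it does not probe the large-deviation regime of the proposition, and the function names and the seed are choices made here.

```python
import math
import random

def primes_up_to(limit):
    """Sieve of Eratosthenes for limit >= 2."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def weights(X):
    """Weights (log p / sqrt(p)) * (1 - log p / X) for primes p <= e^X."""
    return [math.log(p) / math.sqrt(p) * (1 - math.log(p) / X)
            for p in primes_up_to(int(math.exp(X)))]

def simulate(X, samples, seed=0):
    """Draw independent samples of Y using random signs for each prime."""
    rng = random.Random(seed)
    w = weights(X)
    return [2 * sum(wj if rng.random() < 0.5 else -wj for wj in w)
            for _ in range(samples)]

def exact_variance(X):
    """Var(Y) = 4 * sum of squared weights, since the signs are independent."""
    return 4 * sum(wj * wj for wj in weights(X))
```

In the search for twists one is interested in the rare samples with $Y$ of size about $X^{3/2}$, far out in the tail of this distribution, which is why exponentially many twists are needed.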

3.1. A conjecture for $\theta^*$ via random matrix theory. The random matrix philosophy
suggests (e.g. by comparing the 1-level densities) that the relevant symmetry for a family of
primitive quadratic twists is symplectic. The symplectic group $USp(2N)$ is a compact group
consisting of $2N \times 2N$ unitary matrices $A$ satisfying $A^{t} J A = J$, where

$$J = \begin{pmatrix} 0 & I_N \\ -I_N & 0 \end{pmatrix}.$$

The eigenvalues of $A$ lie on the unit circle, come in conjugate pairs, and can be written
uniquely as

$$e^{\pm i\theta_1(A)}, \ldots, e^{\pm i\theta_N(A)}, \qquad 0 \le \theta_1(A) \le \cdots \le \theta_N(A) \le \pi.$$

Making the identification $2N = \log|\Delta|$, we expect that statistics of the lowest eigenphase
$\theta_1(A)$ as $A$ varies in $USp(2N)$ coincide to leading order, and modulo arithmetic effects, with
statistics of the lowest zero $\gamma_1(q\Delta)$ as $q$ varies in $\mathcal{F}$ but still sufficiently small compared
to $|\Delta|$. Thus, by computing statistics of $\theta_1(A)$, we arrive at conjectures for $\gamma_1(q\Delta)$. In
particular, since the complexity of our algorithm depends on the frequency of large values
of $\gamma_1(q\Delta)$, we are led to consider the tail distribution of $\theta_1(A)$.

Footnote (on the identification $2N = \log|\Delta|$): This identification is obtained by equating the mean spacing of eigenphases of $A \in USp(2N)$, which is
$\pi/N$, and the mean spacing of zeros of $L(s,\chi_{q\Delta})$ at a fixed height, which is $\sim 2\pi/\log|q\Delta| \sim 2\pi/\log|\Delta|$ as
$|\Delta| \to \infty$, $\Delta \in \mathcal{F}$.

To this end, and to facilitate comparison with other symmetry groups later on, let $U(N)$
denote the (compact) group of $N \times N$ unitary matrices, and $SO(2N) \subset U(2N)$ the group of
orthogonal matrices of determinant 1. The eigenphases of $A \in U(N)$ can be written uniquely
as $0 \le \theta_1(A) \le \cdots \le \theta_N(A) < 2\pi$, while those of $A \in SO(2N)$, which come in pairs $\pm\theta_j(A)$,
can be written uniquely as $0 \le \theta_1(A) \le \cdots \le \theta_N(A) \le \pi$. Let $P_{G(N)}$ denote the unique Haar
measure on $G(N) \in \{U(N), SO(2N), USp(2N)\}$, normalized to be a probability measure.
The random matrix philosophy suggests, for example, that the relevant symmetry group for
averages over a family of twists by $n^{it}$ is unitary, while for averages over a family of elliptic
curves it is orthogonal (even or odd, depending on the sign of the functional equation in the
family).

Let $P^{M}_{G(N)} = P_{G(N)} \times \cdots \times P_{G(N)}$, repeated $M$ times, be the product measure on $G(N)^M$. For
each Borel-measurable set $J \subset [0, \sigma\pi]$, where $\sigma = 2$ if $G(N) = U(N)$ and $\sigma = 1$ otherwise,
define $S(J) := \{(A_1, \ldots, A_M) \in G(N)^M : \max_{1 \le m \le M} \theta_1(A_m) \in J\}$. For short-hand, we write
$P_{G(N)}(\max_{1 \le m \le M} \theta_1(m) \in J)$ in place of $P^{M}_{G(N)}(S(J))$, $P_{G(N)}(\max_{1 \le m \le M} \theta_1(m) > s)$ in place
of $P^{M}_{G(N)}(S((s,\infty)))$, and so on. The distribution function $P_{G(N)}(\theta_1 > s)$ is known as the
gap probability. The proofs of Propositions 3.3-3.4 below are given in the appendix.
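Proposition 3.3. Fix $\beta \in (0,2)$, and define

$$M_\beta(N) := \lfloor \exp((2N)^{\beta}) \rfloor, \qquad s_\beta^{\pm}(N) := (4 \pm \varepsilon)(2N)^{\beta/2-1}.$$

Then, for each fixed $\varepsilon > 0$, as $N \to \infty$ we have

$$P_{USp(2N)}\Big(s_\beta^{-}(N) \le \max_{1 \le m \le M_\beta(N)} \theta_1(m) \le s_\beta^{+}(N)\Big) \to 1.$$

In other words, $(2N)^{1-\beta/2} \max_{1 \le m \le M_\beta(N)} \theta_1(m)$ converges in distribution to 4.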

Therefore, if we choose $A_1, \ldots, A_{M_\beta(N)} \in USp(2N)$, independently and uniformly with
respect to $P_{USp(2N)}$, then in the limit as $N \to \infty$, we have $\max_{1 \le m \le M_\beta(N)} \theta_1(A_m) \ge s_\beta^{-}(N)$
with probability approaching 1. For instance, if $\beta = 1$, then we expect to find at least one
lowest eigenphase of size $\ge (4 - \varepsilon)/\sqrt{2N}$. Since the eigenvalues of symplectic matrices come
in conjugate pairs, this corresponds to an eigenphase spacing $\ge 2(4 - \varepsilon)/\sqrt{2N} \approx \sqrt{32/N}$,
which is on the order of $\sqrt{2N}$ times the average spacing.

In contrast, the eigenvalues of unitary matrices do not necessarily come in conjugate pairs,
so the point 1 on the unit circle is no longer distinguished, and actually $P_{U(N)}$ is rotationally
invariant. Thus, it is more natural to consider the nearest-neighbor distribution function,
$P_{U(N)}(\theta_2 - \theta_1 > u) := P_{U(N)}(\{A \in U(N) : \theta_2(A) - \theta_1(A) > u\})$, which is related to the gap
probability by differentiation; see (23) in the appendix. In fact, $\log P_{U(N)}(\theta_2 - \theta_1 > u) \sim
\log P_{U(N)}(\theta_1 > u)$ as $N \to \infty$ over a wide range of $u$. To facilitate comparison with the
symplectic case, we consider the half-spacings $\frac12[\theta_2 - \theta_1]$ in the following.

Proposition 3.4. Fix $\beta \in (0,2)$, and define

$$M_\beta(N) := \lfloor \exp(N^{\beta}) \rfloor, \qquad u_\beta^{\pm}(N) := \sqrt{8 \pm \varepsilon}\, N^{\beta/2-1}.$$

Then, for each fixed $\varepsilon \in (0,8]$, as $N \to \infty$ we have

$$P_{U(N)}\Big(u_\beta^{-}(N) \le \max_{1 \le m \le M_\beta(N)} \tfrac12[\theta_2(m) - \theta_1(m)] \le u_\beta^{+}(N)\Big) \to 1.$$

In other words, $N^{1-\beta/2} \max_{1 \le m \le M_\beta(N)} \frac12[\theta_2(m) - \theta_1(m)]$ converges in distribution to $\sqrt{8}$. The
same is true if, in addition, one maximizes over the spacings of each matrix, replacing
$\frac12[\theta_2(m) - \theta_1(m)]$ by $\max_{1 \le j \le N} \frac12[\theta_{j+1}(m) - \theta_j(m)]$, where $\theta_{N+1} := 2\pi + \theta_1$.




In light of Prop. 3.4, we see that sampling from $U(2N)$ does not do as well as sampling
from $USp(2N)$. For if we choose $\lfloor \exp((2N)^{\beta}) \rfloor$ matrices from $U(2N)$, independently and
uniformly with respect to $P_{U(2N)}$, then we expect that half the max spacing is $\approx \sqrt{8}\,(2N)^{\beta/2-1}$,
which is worse than the symplectic case by a factor of $\sqrt{2}$. Note that we could have compared
$USp(2N)$ and $U(N)$ instead, but to do so meaningfully the eigenphases in the two ensembles
should be re-normalized to have the same mean spacing, so that $U(N)$ still does worse by a
factor of $\sqrt{2}$.

This suggests that our algorithm should do better if it searches through $q$-twists rather
than $t$-twists, i.e. $\gamma_1(q\Delta)$ as opposed to $\frac12[\gamma_{j+1}(\Delta) - \gamma_j(\Delta)]$, for $q$ and $j$ in a suitable range.
This agrees with our observations in practice. An additional reason for it might be that
the assumption of independent samples is less applicable to the gaps $\gamma_{j+1}(\Delta) - \gamma_j(\Delta)$, since
they come from a single L-function, and are thus constrained by its analytic properties, in
contrast to the $\gamma_1(q\Delta)$, which come from different L-functions. For example, we intuitively
expect $\gamma_{j+1}(\Delta) - \gamma_j(\Delta)$ to have negative correlations over short ranges (and also some long-range
correlations due to the primes; see [16] for a numerical discussion of this in the case
of zeta). While such negative correlations do not affect the 2/3 in the analogue of Conj. 3.1
for the $t$-aspect, they likely make the implied asymptotic constants worse.

In order to make a conjecture for $\theta^*$ based on our $USp(2N)$ calculation, we identify $2N$
with $\log|q\Delta|$, as usual, and the lowest eigenphase with $\gamma_1(q\Delta)$. If $\theta < 1$ then twisting by
$\chi_q$ does not affect the density of zeros appreciably, so we may interpret Prop. 3.3 for fixed
$2N \approx \log|\Delta|$ as sampling $\gamma_1(q\Delta)$ for $q$ from $\{q \in \mathcal{F} : (q,\Delta) = 1,\ |q| \le \exp((\log|\Delta|)^{\theta})\}$. The
conclusion of the proposition thus suggests that $M_\Delta(\theta) \asymp (\log|\Delta|)^{\theta/2-1}$; in particular, we
expect $\eta_\infty(\theta) = 1 - \theta/2$, and so $\theta^* = 2/3$. On the other hand, if $\theta > 1$ then $q$ becomes the
dominating factor in the zero density; thus, we expect the maximum of $\gamma_1(q\Delta)$ to be attained
for a relatively small choice of $q$, meaning we do not derive any benefit from increasing $\theta$
further, and $\eta_\infty(\theta)$ is constant. Note that similar conclusions are reached if we sample twists
by $n^{it}$ instead.



Finally, we remark that Conj. 3.1 is of independent interest and may warrant further
study. One can try to confirm it directly (i.e. by computing the first zero of many twists),
but this requires taking $|\Delta|$ fairly large before one can hope to discern a clear pattern. Basic
experiments suggest taking $|\Delta| > 10^{15}$, say, which is prohibitively time-consuming using
the standard approximate functional equation, as one would need to compute $\gamma_1(q\Delta)$ for
millions of $q$. It would perhaps be better to formulate a precise conjecture for $M_\Delta(\theta)$ itself,
including lower order terms, and check the numerics for that. One could also try to confirm
the conjecture for other families.



Footnote (on $q$-twists versus $t$-twists): To clarify the analogy with $U(N)$ a little more, we expect $\max_{t \le \gamma_j(\Delta) \le t+2\pi} \frac12[\gamma_{j+1}(\Delta) - \gamma_j(\Delta)]$, for
$t = |\Delta|^{O(1)}$, to be modelled by $\max_{1 \le j \le N} \frac12[\theta_{j+1}(m) - \theta_j(m)]$, where $N \approx \log\frac{|\Delta| t}{2\pi}$.

3.2. Numerical results. We applied our method to a few RSA-numbers, but our main test
case was RSA-210, which is the following 210-digit number:

RSA-210 = 2452466449002782119765176635730880184670267876783327 
5974341445171506160083003858721695220839933207154910 
3626827191679864079776723243005600592035631246561218 
465817904100131859299619933817012149335034875870551067. 

We searched for candidate twists (i.e. twists that are expected to make the prime sum
large) essentially by brute-force, with some modest refinements described in §4.2. We first
used a simple weighting function, such as a triangle wave, to evaluate a short prime sum
(typically with $p \le 10^4$) for all twists within a given range, then incrementally increased the
length of the sum as we filtered the results. The candidates found this way were then fed
into the lower bound (6), this time using a much longer prime sum and the test function
produced by the quadratic form method outlined in §2.1. Our best-performing twist was
$-9334602088654580277283 = -568391 \times 2345033 \times 7003250461$, which yielded a lower bound
of 137.5158 using the primes $p \le 1.3 \times 10^{15}$.

Proof of Theorem 1.1. Suppose, to the contrary, that all prime factors of $N$ = RSA-210 have
multiplicity $> 1$. Then $N = s^2|\Delta|^3$, where $|\Delta|$ is the conductor of $\chi_{-N}$. We verified that $N$
is not a perfect cube, and our computation showed that $\log|\Delta| > 137.515$, so that $1 < s \le
1.3 \times 10^{15}$. This leads to a contradiction since, as a by-product of computing the prime
sum in the explicit formula, we checked that $N$ has no non-trivial factor $\le 1.3 \times 10^{15}$. □
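The arithmetic in this proof can be checked directly. The sketch below takes the digits of RSA-210 as printed above and computes the largest $s$ compatible with $N = s^2|\Delta|^3$ and a given lower bound on $\log|\Delta|$; it also confirms that $N$ is not a perfect cube. It does not recompute the bound 137.515 itself, which requires the long prime sum, and the function names are illustrative.

```python
import math

RSA_210 = int(
    "2452466449002782119765176635730880184670267876783327"
    "5974341445171506160083003858721695220839933207154910"
    "3626827191679864079776723243005600592035631246561218"
    "465817904100131859299619933817012149335034875870551067"
)

def max_square_part(n, log_conductor_bound):
    """Upper bound for s when n = s**2 * D**3 and log(D) > log_conductor_bound."""
    return math.exp((math.log(n) - 3 * log_conductor_bound) / 2)

def integer_cube_root(n):
    """Floor of the cube root of n >= 0, by bisection on exact integers."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def is_perfect_cube(n):
    return integer_cube_root(n) ** 3 == n
```

With the bound 137.515 the value `max_square_part(RSA_210, 137.515)` falls just below $1.3 \times 10^{15}$, matching the range of primes covered by the prime sum.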

Remark. We do not need the full strength of the bound $\log|\Delta| > 137.515$ to prove the
theorem, as we separately ruled out factors $< 10^{20}$ using the implementation of Pollard's
$p - 1$ method in GMP-ECM [?], which takes less than a day on a computer with 80GB of
memory. Therefore, the bound $\log|\Delta| > 130.02$ suffices to prove the theorem, which reduces
the size of the prime sum needed to $p < 2.66 \times 10^{14}$.
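
The arithmetic behind the proof and the remark can be checked directly. The following sketch (the names `RSA_210`, `BEST_TWIST_ABS` and `max_log_s` are chosen here for illustration) rebuilds the best twist from its printed factors and converts each lower bound on $\log|\Delta|$ into the implied upper bound on $\log s$ via $N = s^2|\Delta|^3$. It does not reproduce the prime-sum computation itself.

```python
import math

# RSA-210 exactly as printed in the text (210 decimal digits).
RSA_210 = int(
    "2452466449002782119765176635730880184670267876783327"
    "5974341445171506160083003858721695220839933207154910"
    "3626827191679864079776723243005600592035631246561218"
    "465817904100131859299619933817012149335034875870551067"
)

# Absolute value of the best-performing twist, rebuilt from its printed factors.
BEST_TWIST_ABS = 568391 * 2345033 * 7003250461


def max_log_s(log_conductor_lower_bound):
    """Upper bound for log s when N = s^2 |Delta|^3 and log|Delta| exceeds the given bound."""
    return (math.log(RSA_210) - 3.0 * log_conductor_lower_bound) / 2.0


if __name__ == "__main__":
    print(BEST_TWIST_ABS == 9334602088654580277283)
    print(max_log_s(137.515) < math.log(1.3e15))  # bound used in the proof
    print(max_log_s(130.02) < math.log(1e20))     # bound used in the remark
```

Both comparisons involve only a small margin on the logarithmic scale, which is why the precise value of the lower bound matters.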

In Figure 3 we present data about the practical efficiency of our algorithm, providing further
evidence for the 2/3 exponent. To clarify the situation, recall that $\theta^*$ is chosen to balance
the number of terms in the prime sum versus the number of twists we need to try so that,
with high probability, the zero contribution is small for at least one twist. We accomplish
this by taking $g$ with support inversely proportional to the largest gap that we anticipate
after trying $\exp((\log|\Delta|)^{\theta})$ twists, and so our prime sum has length $< \exp(c_1(\log|\Delta|)^{\eta_\Delta(\theta)})$
for some suitable constant $c_1 > 0$. The 2/3 arises as Prop. 3.3 suggests that $\eta_\infty(\theta) \approx 1 - \theta/2$
and on solving $1 - \theta/2 = \theta$. More precisely, it arises since if we sample $\gamma_1(q\Delta)$ over $q \in \mathcal{F}$,



Note that the integral in (3) can be computed to high precision using standard numerical integration
methods. Moreover, in our final, long prime sum, we used approximations of the test function by Chebyshev 
polynomials, which allows most of the summation to be carried out in integer arithmetic. In this way we can 
effectively control the round-off error in the computation. On the other hand, since the longest sum that 
we computed was over the primes < 1.3 x 10 15 , standard double-precision arithmetic would also suffice to 
control the round-off errors effectively for many choices of g, e.g. by using pairwise summation. 

Like Pollard-Strassen, this method can be used to rule out small factors, with comparable complexity. 
Pollard-Strassen has a theoretical advantage, in that the p — 1 method produces an inconclusive result if a 
randomly-chosen residue happens to have exactly the same order modulo every prime factor of N; however, 
the chance of that occurring is vanishingly small, so this is irrelevant in practice. 
FIGURE 3. The lower bound (under GRH) produced by our method when applied to
RSA-210 using the primes $< e^x$. The *s mark the places where the best performing twist
available so far changes. The slope increases noticeably at each *, except towards the end,
where it is likely that we are not finding the best twists. Also note that the $\sqrt{2}$ in our fitted
curve is likely not an absolute constant, but varies as a small power of $\log\log|\Delta|$.



$|q| < \exp X$, with $X$ much smaller than $\log|\Delta|$, then Prop. 3.3 suggests we should encounter
at least one $\gamma_1(q\Delta) > 4\sqrt{X}/\log|\Delta|$. Therefore, looking back at Prop. 2.2, we expect to obtain
a lower bound very close to $\log|\Delta|$ in time $< \exp\Bigl(X + \frac{c_2\log|\Delta|\log\log|\Delta|}{4\sqrt{X}}\Bigr)$, where $c_2 > 0$ is
a constant implied by the proposition. Optimizing, we choose $X = \bigl(\frac{c_2}{8}\log|\Delta|\log\log|\Delta|\bigr)^{2/3}$.
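
As a sanity check on the optimization step, the following sketch minimizes an exponent of the shape $X + K/\sqrt{X}$ and compares the result with the calculus answer $X = (K/2)^{2/3}$. Taking $K = c_2\log|\Delta|\log\log|\Delta|/4$ is our reading of the running-time bound above; the value `c2 = 1.0` and the sample size of $\log|\Delta|$ are placeholders, not values from the text.

```python
import math


def exponent(X, K):
    """Exponent X + K / sqrt(X) of the heuristic running-time bound."""
    return X + K / math.sqrt(X)


def optimal_X(K):
    """Stationary point of the exponent: 1 - K / (2 X^(3/2)) = 0."""
    return (K / 2.0) ** (2.0 / 3.0)


def heuristic_K(log_cond, c2=1.0):
    """K = c2 * log|Delta| * log log|Delta| / 4 (c2 = 1.0 is a placeholder)."""
    return c2 * log_cond * math.log(log_cond) / 4.0


LOG_COND = 482.0  # roughly log(RSA-210); used only as an illustrative size
K = heuristic_K(LOG_COND)
X_STAR = optimal_X(K)

if __name__ == "__main__":
    print(X_STAR, exponent(X_STAR, K))
```

At the optimum the second term equals $2X$, so the exponent is $3X$, which is consistent with the 2/3 power of $\log|\Delta|$ up to logarithmic factors.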
This reasoning on its own does not fully explain what we observe in Figure 3, which is that
by sampling $\exp X$ twists and using a prime sum of length $\exp X$, we seem to obtain a lower
bound like $X^{3/2}$, even for intermediate values of $X$ much smaller than $(\log|\Delta|)^{2/3}$. This
behavior is expected if one treats the prime sum as a sum of independent random variables,
as in Prop. 3.2, but it would be reassuring to see it from the zeros directly. The difficulty
here is that if $h$ does not have sufficient decay outside the large gap, then we cannot
bound the contribution of the zeros effectively (cf. Prop. 2.2). Nevertheless, one can obtain
a heuristic explanation, as follows.

We choose $h$ with $0 \le h(t) \le 1$, say, and mostly concentrated within the interval
$[-1/X, 1/X]$, roughly speaking. We let $N_\chi(t) := \#\{0 < \gamma(\chi) < t\}$ denote the zero-counting
function, and assume $L(1/2,\chi) \ne 0$ for simplicity. Then $\sum_{\gamma(\chi)} h(\gamma(\chi)) = 2\int_0^\infty h(t)\,dN_\chi(t)$.
The contribution of the smooth part of $N_\chi(t)$ to the integral is $\approx g(0)\log|\Delta|$, which is precisely
the left-hand side of the explicit formula. Therefore, the prime sum contribution, which is basically our
lower bound, should be $\approx -2\int_0^\infty h(t)\,dS_\chi(t)$, where $S_\chi(t)$ is the fluctuating part of $N_\chi(t)$.
This last integral is typically very small due to the random nature of $S_\chi$, except that we purposefully
introduced a bias in it via our choice of twist, resulting in a large gap around the
center, of size like $\sqrt{X}/\log|\Delta|$. The contribution of this bias to the prime sum is essentially,
for a reasonable $h$, $-2\int_0^{\sqrt{X}/\log|\Delta|} h(t)\,dS_\chi(t) \gg \sqrt{X}$. Since we expect the contribution from
the interval $[\sqrt{X}/\log|\Delta|, \infty)$ to wash out in comparison, we should get a lower bound like
$g(0)\log|\Delta| \gg \sqrt{X}$. Finally, since $g(0) \ll 1/X$, we should get $\log|\Delta| \gg X^{3/2}$.

This heuristic indicates how the running time of our algorithm is controlled by extreme 
(negative) values of S x (t). If S x (t) -Ct (log | A|) 1 / 2+ ° ( ^ 1 ' ) , for example, then we cannot expect 
a running time better than exp((log | A|) 1 ' 2+o( - 1 ' 1 ), even if we allow for an oracle supplying the 



algorithm with the best twist in any requested range. (This is in agreement with Conj. 3.1 
On the other hand, if S x (t) can get much larger (without violating the GRH, so our method 
can still apply!), then there is no such barrier. 

4. Refinements 

In this section, we describe a few refinements of our basic method and indicate some 
directions for future research. 

4.1. Linear programming. A natural question is whether one can make better use of the
zero sum in the explicit formula than simply ignoring it by positivity, as in (6), especially since it typically
dominates the right-hand side when $X$ is small. One idea is to apply the explicit formula
with various choices of test function, setting up a system of inequalities, and try to obtain a
non-trivial lower bound for the sum over zeros. Since $\log|\Delta|$ also appears in (3) and remains
unknown to us, the logic of this may seem circular at first glance, but we gain some additional
information coming from the fact that the zeros occur discretely, as we elaborate below.

An immediate practical problem is that the system involves infinitely many variables,
since the zero sum is infinite, and $h$ cannot be compactly supported (it has to be analytic).
Nevertheless, one can reduce to a finite number of variables, without too much loss, using
an explicit estimate of the form $\bigl|\sum_{|\gamma|\ge T} h(\gamma)\bigr| = \bigl|2\int_T^\infty h(t)\,dN_\chi(t)\bigr| \le \mathcal{E}(h,T)$, $T > 0$, simply
bounding the conductor by the modulus, and using known estimates for $S_\chi(t)$. Hence

$$\sum_{|\gamma|<T} h(\gamma) - \mathcal{E}(h,T) \le \sum_{\gamma} h(\gamma) \le \sum_{|\gamma|<T} h(\gamma) + \mathcal{E}(h,T). \tag{8}$$

A more serious problem is that the system is not linear in the zero ordinates, and therefore
is likely very unstable. We linearize the system, at the cost of having more variables or
extra solutions, by subdividing the interval $[0,T)$ into bins of size $\delta$, so that the variables
become the count of zeros in each bin rather than the zeros themselves. Specifically, for
each integer $V > 0$, and each integer $v \in [0,V)$, let $\delta := T/V$, $I(v) := [v\delta, (v+1)\delta)$,
$m(v) := \frac12\#\{\gamma : |\gamma| \in I(v)\}$, $h^+(v) := \sup_{t\in I(v)} h(t)$, and $h^-(v) := \inf_{t\in I(v)} h(t)$. Then we
have

$$2\sum_{0\le v<V} m(v)h^-(v) \le \sum_{|\gamma|<T} h(\gamma) \le 2\sum_{0\le v<V} m(v)h^+(v). \tag{9}$$
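
The binning step can be illustrated with a small sketch. It assumes a stand-in test function $h(t) = 1/(1+t^2)$, chosen only because it is decreasing on $[0,\infty)$ so that the supremum and infimum over each bin sit at the endpoints, and a hypothetical list of positive ordinates (not zeros of any actual L-function). It then checks that the binned sums sandwich the exact sum as in (9).

```python
def h(t):
    """Stand-in test function (an assumption): positive and decreasing on [0, inf)."""
    return 1.0 / (1.0 + t * t)


def binned_bounds(positive_ordinates, T, V):
    """Return (lower, exact, upper) for the sum of h over zeros with |gamma| < T."""
    delta = T / V
    m = [0] * V  # m[v] counts each +/- pair of zeros once
    for gamma in positive_ordinates:
        if 0.0 <= gamma < T:
            m[min(int(gamma / delta), V - 1)] += 1
    # h is decreasing, so on [v*delta, (v+1)*delta) the sup is at the left
    # endpoint and the inf is at the right endpoint.
    upper = 2.0 * sum(m[v] * h(v * delta) for v in range(V))
    lower = 2.0 * sum(m[v] * h((v + 1) * delta) for v in range(V))
    exact = 2.0 * sum(h(g) for g in positive_ordinates if 0.0 <= g < T)
    return lower, exact, upper


# Hypothetical ordinates, for illustration only.
SAMPLE = [0.05, 0.13, 0.21, 0.37, 0.523, 0.9, 1.7, 3.3]
LOWER, EXACT, UPPER = binned_bounds(SAMPLE, T=4.0, V=500)

if __name__ == "__main__":
    print(LOWER, EXACT, UPPER)
```

In the linear program the counts `m[v]` are the unknowns rather than inputs; the sketch only shows why the two-sided inequality is linear in them.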



This is the part of the heuristic that we cannot prove, even under the GRH, unless h has sufficient decay 
outside the zero gap. 
Applying (8) and (9) with a set of test functions $\{(g_k,h_k) : 1 \le k \le K\}$ of our choice, we
obtain a linear system

$$2\sum_{0\le v<V} m(v)h_k^-(v) - \mathcal{E}(h_k,T) \le g_k(0)\log|\Delta| + g_k(0)\log|q| - P(g_k,q) \le 2\sum_{0\le v<V} m(v)h_k^+(v) + \mathcal{E}(h_k,T) \tag{10}$$

for $k = 1,\ldots,K$, where $\chi_q$ is the twist used, and $P(g_k,q)$ denotes the contribution from
the prime sum and integral terms in the explicit formula. Note that $V$ controls the size of each bin, and $T$
controls the point where we truncate the zero sum. Finally, we let $\log d$ denote the unknown
value of $\log|\Delta|$, and feed the system (10) into a linear programming solver, such as GLPK [?],
with $\log d$ as the objective function to be minimized.

We experimented with this approach for RSA-210 using various choices of $q$, $T$, $V$, and
$\{(g_k,h_k) : 1 \le k \le K\}$. For example, one of the better performing twists found, as in §3.2,
was $q = -65123121667$. Using this twist, we set up the system (10) with $T = 4$ and $V = 500$
(so that $\delta = 0.008$, which is smaller than the mean zero spacing $\approx 0.013$), and

$$h_k(t) = \left[\frac{\sin(Xt/2k)}{Xt/2k}\right]^{2k}, \qquad k = 1,\ldots,7, \quad X = 7\log 10,$$

so that $g_k(x)$, $k = 1,\ldots,7$, are supported on $|x| \le X$. We imposed an integer variable
constraint on $m(v)$, $v = 0,\ldots,44$, with the rest being real variables. The integer variables are
located at the beginning, covering the interval $[0, 0.36)$, which is reasonable since $h_k(t)$ is not
too small there and so detected zeros have more weight. Solving this system, we obtained
the lower bound $\log|\Delta| > 47.153$, of which 2.494 came from the zeros. This represents
an improvement of about 5.5% over using $\max_{1\le k\le 7}[P(g_k,q) - g_k(0)\log|q|]$ alone, which is
comparable to the improvement that we obtained from using the Pollard $p - 1$ algorithm
to rule out small values of $\ell$, as remarked after the proof of Thm. 1.1. Although this is a
modest improvement on a logarithmic scale, it makes a substantial difference in the length
of the final prime sum.
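
The parameters quoted in this experiment are mutually consistent, as the following short check shows. It is only bookkeeping on the printed numbers (the variable names are ours) and does not set up or solve the linear program.

```python
import math

T, V = 4.0, 500                 # truncation height and number of bins, from the text
DELTA = T / V                   # bin width
N_INTEGER_BINS = 45             # integer-constrained bins v = 0, ..., 44
BOUND, FROM_ZEROS = 47.153, 2.494

# Right end of the range covered by the integer-constrained bins.
COVERED = N_INTEGER_BINS * DELTA
# Relative improvement over the bound obtained without the zero contribution.
GAIN = FROM_ZEROS / (BOUND - FROM_ZEROS)

if __name__ == "__main__":
    print(DELTA, COVERED, GAIN)
```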

In general, further gains are possible by using more integer variables, a smaller grid spacing
(smaller $\delta$), or additional test functions, in that order of importance. In reality, adding more
test functions of compact support of size $X$ loses impact quickly, which is not surprising
because such functions cannot resolve zeros to better than $O(1/X)$. Most of the gains, in
fact, come from imposing integer constraints. If no integer constraints are imposed, the
improvement in the above example goes down significantly, to around 1%. Also, if all the
variables are real, then the linear programming approach is closely related to the approach
of varying the test function, described in §2.1, and so cannot be expected to do significantly
better.

However, one has to weigh the extra time it takes to set up and solve the mixed integer 
programming problem against the time it takes to simply compute a longer prime sum. In 






The inequalities in (10) were imposed in both directions except for hi, where only the lower bound was 
used. 

In the real variable case one can obtain an easily verifiable certificate that the solution is indeed correct 
by solving the dual problem. This is not available if one imposes integer constraints, and so one has to trust 
the linear programming software in that case. 
the above example, it took about 15 minutes to solve the problem, but it can take much
longer if more integer constraints are imposed. It is tempting to think that if one could 
allow the number of integer variables to grow very large without significant time penalty 
then there would be no limit to the improvement that could be obtained. We offer the 
following theoretical evidence in favor of that belief. 

Definition 4.1. Let $S = \{z \in \mathbb{C} : |\Im(z)| < 1/2\}$. A divisor on $S$ is a function $m : S \to \mathbb{Z}$
which is supported on a discrete subset of $S$. A divisor $m$ is admissible if $m(-\gamma) = m(\gamma) \ge 0$
for all $\gamma \in S$ and there is a number $A \ge 0$ such that $\sum_{\gamma\in S,\ |\gamma|\le T} m(\gamma) \ll T^A$ for all $T \ge 1$.

Proposition 4.2. Let $m : S \to \mathbb{Z}_{\ge 0}$ be an admissible divisor, $d \in \mathbb{R}^\times$, and $\{c_n\}_{n=2}^\infty$ a
sequence of complex numbers satisfying $\lim_{n\to\infty} c_n = 0$. Suppose that for every smooth, even
function $g : \mathbb{R} \to \mathbb{C}$ of compact support and cosine transform $h$ we have the equality

$$g(0)\log|d| = \sum_{\gamma\in S} m(\gamma)h(\gamma) + 2\sum_{n=2}^\infty c_n g(\log n) + g(0)\log(8\pi e^{\gamma}) - \int_0^\infty \frac{g(0)-g(x)}{2\sinh(x/2)}\,dx + (\operatorname{sgn} d)\int_0^\infty \frac{g(x)}{2\cosh(x/2)}\,dx.$$

Then $d$ is a fundamental discriminant, $c_n = \Lambda(n)\chi_d(n)/\sqrt{n}$ for every $n \ge 2$, and $m(\gamma) =
\operatorname{ord}_{s=1/2+i\gamma} L(s,\chi_d)$ for all $\gamma \in S$.

Thus, the explicit formula is rigid in the sense that the only identities of this shape
that can hold for all test functions are the ones arising from quadratic character L-functions.
We remark that the key to this proposition, whose full proof is given in the appendix, is
that $m$ is integer-valued and supported on a discrete set. Unfortunately, the proposition is
ineffective, in that it does not predict how many test functions we must choose, or how
complicated they must be, before finding a system that yields a good lower bound for $\log|d|$. However, note
that under GRH, the $\Delta \in \mathcal{F}$ with $|\Delta| \le x$ are distinguished from one another by the values
of $\chi_\Delta(p)$ at primes $p \le O(\log^2 x)$. This statement alone does not offer any indication of how
to find $\Delta$ given a list of its initial character values, but together with Prop. 4.2 it suggests
that a given $\Delta$ might be captured by the system (10) using test functions supported up to
$X \approx 2\log\log|\Delta|$, provided that we are allowed to take $V$ and $K$ arbitrarily large. However,
our numerical experiments so far, which were limited to at most a few hundred integer
variables, have not corroborated this speculation, even allowing for larger values of $X$.

4.2. Finding correlating characters. 

4.2.1. Lining up the initial primes. In order to improve the efficiency of the brute-force
search, we chose $q$ so as to line up the values of the prime sum for small $n$, i.e. so that
$\chi_{q\Delta}(p) = 1$ for small primes $p$. Of course there is no guarantee that doing so is optimal, and
indeed it is likely that the best choices of $q$ of a given size adhere to this principle only loosely,
i.e. they may sacrifice a few small values of $p$ in order to line up many more. However, if we
have the resources to evaluate the prime sum for a fixed number of $q$, regardless of size (a
reasonable assumption, since the only operation performed on $q$ itself is reduction mod $p$),
then it makes sense to line up the small primes in an attempt to skew the distribution of values
in our favor.
To be more precise, consider an idealized form of the lower bound (6) with $h$ a $\delta$-function
and $g = 1$. Then for a prime power $n = p^k$, the corresponding term of (6) is $2\chi_{q\Delta}(n)\Lambda(n)/\sqrt{n}$.
The expected value of this term, that is, its average value over all $q \in \mathcal{F}$, is easily seen to
be 0 if $k$ is odd and $2\frac{p}{p+1}\frac{\Lambda(n)}{\sqrt{n}}$ if $k$ is even. Thus, if we force $q$ to satisfy $\chi_{q\Delta}(p) = 1$, this
introduces a positive bias in the prime sum (after summing over $k$) of $\frac{2\log p}{p-1}\bigl(\sqrt{p} + \frac{1}{p+1}\bigr)$.
However, it also comes with a price, in that we expect such a $q$ to be about $2(p+1)/p$
times larger than a fundamental discriminant chosen randomly without regard to the value
of $\chi_{q\Delta}(p)$. Thus, our expected net improvement is

$$\frac{2\log p}{p-1}\left(\sqrt{p} + \frac{1}{p+1}\right) - \log\frac{2(p+1)}{p}. \tag{11}$$

(A similar argument applies to forcing $\chi_{q\Delta}(-1) = 1$, from which we expect a net improvement
of $\frac{\pi}{2} - \log 2$.) It turns out that (11) is positive for $p \le 251$ but negative for larger primes.
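
The threshold for (11) is easy to confirm numerically. The sketch below (function names are ours) evaluates the gain $\frac{2\log p}{p-1}\bigl(\sqrt{p} + \frac{1}{p+1}\bigr)$ minus the penalty $\log\frac{2(p+1)}{p}$ at each prime up to 1000 and collects the primes for which the net improvement is positive.

```python
import math


def net_improvement(p):
    """Expected gain from forcing chi(p) = 1 minus the log-size penalty, as in (11)."""
    gain = 2.0 * math.log(p) / (p - 1) * (math.sqrt(p) + 1.0 / (p + 1))
    penalty = math.log(2.0 * (p + 1) / p)
    return gain - penalty


def primes_up_to(n):
    """Primes <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % r for r in range(2, math.isqrt(p) + 1))]


WORTH_FORCING = [p for p in primes_up_to(1000) if net_improvement(p) > 0]

if __name__ == "__main__":
    print(WORTH_FORCING[-1], len(WORTH_FORCING))
```

The margin at the crossover is small (a few thousandths on either side), so the cutoff is sensitive to the exact form of the penalty term.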



4.2.2. The shortest lattice vector problem. It is plausible that there is a better strategy
for finding good twists than a brute-force search, meaning a strategy that can find the
same quality twist as brute force but using much less sampling. If one could be assured
of finding $\gamma_1(q\Delta) \gg \sqrt{X}/\log|\Delta|$ in a subset of $\{q \in \mathcal{F} : (q,\Delta) = 1, |q| \le \exp X\}$ of size
$\ll \exp(X^\tau)$, $0 < \tau < 1$, then one could improve the 2/3 exponent to $\max\{\theta_{\min}, \frac{2\tau}{2\tau+1}\}$, where
$\theta_{\min} := \inf_{\theta>0}\eta_\infty(\theta)$, provided the subset can be determined easily. An obvious candidate is
the subset of smooth fundamental discriminants. For example, one could search for a product
of real primitive characters $\chi_q = \chi_{q_1^*}\cdots\chi_{q_m^*}$, $|q_j| \le Q$, $q_j \ne q_k$, that correlates strongly with
$\chi_d$, so as to make

$$\sum_{p\le P}\frac{\log p}{\sqrt{p}} - \sum_{p\le P}\frac{\chi_q(p)\chi_d(p)\log p}{\sqrt{p}} + \log|q| \tag{12}$$

small, in the hope that it will lead to an unusually large prime sum in the explicit formula.
The question of finding a good choice of $\chi_q$ can be framed in terms of finding a short vector
in the lattice generated by the rows of the following $(n+m+1)\times(n+m+1)$ matrix,
as we explain next. (This idea was applied in [17] in the $t$-aspect to disprove the Mertens
conjecture.)
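
To make the objective concrete, here is a minimal sketch that scores a candidate $q$ against a target $d$ by a mismatch cost of the form $\sum_p (1 - \chi_q(p)\chi_d(p))\frac{\log p}{\sqrt{p}} + \log|q|$, which is how we read (12). It rests on several simplifying assumptions of ours: $q$ and $d$ are fundamental discriminants, the sum runs over odd primes only (so that the character value is a Legendre symbol computed by Euler's criterion), and the function names are hypothetical. It is a brute-force scorer, not the lattice reduction described in the text.

```python
import math


def odd_primes_up_to(P):
    """Odd primes p <= P by trial division."""
    return [p for p in range(3, P + 1, 2)
            if all(p % r for r in range(3, math.isqrt(p) + 1, 2))]


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def mismatch_cost(q, d, P):
    """log|q| plus the weighted count of odd primes p <= P where the characters disagree."""
    total = math.log(abs(q))
    for p in odd_primes_up_to(P):
        agreement = legendre(q, p) * legendre(d, p)
        total += (1 - agreement) * math.log(p) / math.sqrt(p)
    return total


if __name__ == "__main__":
    # Hypothetical target and candidates, for illustration only.
    d = -7
    for q in (-7, 5, -3, 13):
        print(q, mismatch_cost(q, d, P=100))
```

A prime dividing $q$ or $d$ contributes the full weight $\log p/\sqrt{p}$ here because the product of symbols vanishes; whether that matches the intended treatment of ramified primes is not stated in the text.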



' i(d,pi) 

i(qi,pi) 
i(<l2,Pi) 

Kim, Pi) 

2w(pi) 




i(d,p 2 ) 

i(qi,P2) 
Kq2,P2) 

i{q m ,P2) 
o 

2w(p 2 ) 



i(d,p n ) 
i(qi,Pn) 

i{q2,Pn) 

i(q m ,Pn) 
o 


2w{p n ) 





L2 M v^ 




g ?1 












L2 M v^ 







IftlJ 









2 M- 






L2 M v / lo^JJ 






16 $E(\chi_{q\Delta}(p^{2\ell})) = E(\chi_{q\Delta}(p^2)) = \frac{p^2-p}{p^2-1} = \frac{p}{p+1}$, where the second equality holds because $q \in \mathcal{F}$, so that $q \not\equiv 0 \pmod{p^2}$.
w(p) :=



2 M y/\^p~ 



P 



1/4 



i(q,p) : = o ( 1 + Xq*{p)) w (p), <f 




9-1 



2 g if g odd prime, 

ifg€{-4,8,-8}. 



Here, $M$ is a large integer of our choice (in our application it was a random integer in
$[75,150)$). The weight $w(p)$ comes from (12), and indicates that it is more important to
correlate smaller primes. The weight $\sqrt{\log|q_j|}$ in the $(n+1)$st to $(n+m)$th columns indicates
that using $\chi_{q_j^*}$ will incur a penalty imposed according to the explicit formula. The bottom $n$
rows indicate that the $k$th entry in each row, $1 \le k \le n$, should be treated modulo $2w(p_k)$.
Here, it is helpful to note that $i(q_j,p_k)$ is either 0 or 1 times $w(p_k)$, and so working modulo
$2w(p_k)$ essentially means that only one multiple of each row is needed. Therefore, a vector
in the lattice with a non-zero $(n+m+1)$-st entry can be written in the form



(13) 



yiw(pi),...,y n w(p n ),u 1 y/l 



og gil 



a, 



^ 



og\qn 



•>M 



where $y_k, u_j \in \{0,1\}$. The character generated by this vector is $\chi_{q_J} := \prod_{j\in J}\chi_{q_j^*}$, where
$J := \{1 \le j \le m : u_j = 1\}$, and it has discriminant $q_J = \prod_{j\in J} q_j^*$. The $y_k$ are 0 or 1
according to whether $\chi_{q_J}(p_k) = \chi_d(p_k)$ or not. Hence, in order for the vector (13) to be
short, it means that

Yl w (Pk) 2 +Yl 

Xqj (Pk)¥"Xd(Pk) J^J 



->Ai 



tog I & I 



+2 



1M 



i2M 



E 

Xqj('Pk)¥=Xd(Pk) 



logPfc 

Vp 



+ J2 1 °s\qj\ + i 



jcj 



has to be small. This expression is essentially the same as (12), which we wish to minimize,
but with $\chi_q = \chi_{q_J}$ and $q = q_J$. Thus, it is seen that one can find a good choice of $\chi_q$ if one
can find a short vector in the lattice.

Finding the shortest vector in a lattice is NP-complete and is conjectured to be NP-hard
in the $L^2$-norm (see [?]). However, one can find relatively short vectors in polynomial time
using the LLL algorithm of Lenstra, Lenstra, and Lovász [15], which produces a basis
that is nearly orthogonal. The LLL algorithm was first used to factor a primitive univariate
polynomial in polynomial time. It does not necessarily find the shortest vector, and it usually
does not, but it can find relatively short vectors quickly.

We applied LLL to our lattice with $P$ and $Q$ ranging from 100 to over 1000. While
it did yield above-average choices of $\chi_q$, such as $q = -73147$, our best-performing twists
ultimately came from the brute-force approach described in §4.2.1.



4.3. More general twists. If $\pi$ is a cuspidal automorphic representation of $\mathrm{GL}_r(\mathbb{A}_\mathbb{Q})$ with
conductor $q$ relatively prime to $\Delta$, then the twist $\pi\otimes\chi_\Delta$ has conductor $q|\Delta|^r$. Assuming
GRH for the associated L-function $L(s,\pi\otimes\chi_\Delta)$, we get a lower bound for $\log|\Delta|$ via the
explicit formula. Thus, the idea of using twists as in §2.2 admits a vast generalization.

For any natural family of twists, one can expect a more general version of Prop. 2.1 to
hold, i.e. for each $X > 0$ there will be some optimal choice of input data $(\pi, g)$, where $\pi$ is
an element of the family and $g \in C(X)$ is a test function to use in the explicit formula. For
instance, considering the family of quadratic twists as in §2.2, it is easy to see that the right-hand
side of (6) is bounded above by $O_X(1) - \log|q|$, uniformly for $g \in C(X)$. Thus, only
finitely many $q$ are relevant, so it follows from Prop. 2.1 that there is a pair $(q,g) \in \mathcal{F}\times C(X)$
which maximizes the lower bound.

Similarly, one can expect an analogue of Prop. 2.3 to hold for any given family. It would
be of interest to study other families to see which yield the best performance. We conclude
by listing a few families of L-functions that would make good candidates for future
investigations.

• Elliptic curve L-functions. Can one make use of the BSD conjecture and the existence 
of high-order zeros at the central point to force zero repulsion? 

• Dedekind ζ-functions. Can one make use of the existence of towers of number fields
of bounded root discriminant?

• Rankin-Selberg products. Can one make use of the algebraic structure of the coefficients
of L-functions to find correlating twists quickly?

Appendix A. Proofs 



A.1. Proof of Prop. 2.1. Let $g_n \in C(X)$, $n = 1,2,3,\ldots$, be a maximizing sequence for $l(g)$,
with corresponding cosine transforms $h_n$. Since each $h_n$ is non-negative, we have $|g_n(x)| \le
g_n(0) = 1$. Therefore, for $j \in \mathbb{Z}_{\ge 0}$,

$$\bigl|h_n^{(2j)}(0)\bigr| = \left|2(-1)^j\int_0^X x^{2j}g_n(x)\,dx\right| \le \frac{2X^{2j+1}}{2j+1}, \tag{14}$$

so that $h_n^{(2j)}(0)$ varies within a compact set for each fixed $j$. Applying Cantor's diagonal argument,
we may assume without loss of generality that the sequence $\{h_n^{(2j)}(0)\}_{n=1}^\infty$ converges
for every $j$. Put $c_j = \lim_{n\to\infty} h_n^{(2j)}(0)$ and $h_\infty(t) = \sum_{j=0}^\infty \frac{c_j}{(2j)!}t^{2j}$. Then from (14) it follows
that $h_\infty$ is an entire function and $h_n(t)$ converges uniformly to $h_\infty(t)$ on compact subsets of
$\mathbb{C}$. In particular, $h_\infty(t) \ge 0$ for all $t \in \mathbb{R}$.

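Next, for any $g \in C(X)$ with cosine transform $h$, we have

$$\log 8\pi e^{\gamma} - \int_0^\infty \frac{1-g(x)}{2\sinh(x/2)}\,dx + \chi_\Delta(-1)\int_0^\infty \frac{g(x)}{2\cosh(x/2)}\,dx = -\frac{1}{\pi}\int_{\mathbb{R}} \Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}\Bigl(\frac12 + a + it\Bigr)h(t)\,dt,$$

where $\Gamma_\mathbb{R}(s) = \pi^{-s/2}\Gamma(s/2)$ and $a \in \{0,1\}$ is such that $(-1)^a = \chi_\Delta(-1)$. By Stirling's
formula we have $\Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}(\frac12 + a + it) = \frac12\log(1+|t|) + O(1)$, so that

$$-\frac{1}{\pi}\int_{\mathbb{R}} \Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}\Bigl(\frac12 + a + it\Bigr)h(t)\,dt = -\frac{1}{\pi}\int_0^\infty \log(1+t)h(t)\,dt + O(1).$$

Moreover, since $|g(x)| \le 1$ for all $x$, we have

$$2\sum_{n=1}^\infty \frac{\Lambda(n)\chi_\Delta(n)}{\sqrt{n}}g(\log n) \ll 1,$$

where the implied constant depends only on $X$.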

Returning to our construction, since $g_n$ is a maximizing sequence, we may assume without
loss of generality that $l(g_n)$ is bounded below. Together with the above observations, we
thus have that $\frac{1}{\pi}\int_0^\infty \log(1+t)h_n(t)\,dt \le C$ for some constant $C$. Hence, for any $T > 0$, we
have

$$\frac{1}{\pi}\int_0^T \log(1+t)h_\infty(t)\,dt = \lim_{n\to\infty}\frac{1}{\pi}\int_0^T \log(1+t)h_n(t)\,dt \le C.$$

Since $T$ is arbitrary and $\log(1+t)h_\infty(t)$ is non-negative, we see that

$$\frac{1}{\pi}\int_0^\infty \log(1+t)h_\infty(t)\,dt \le C. \tag{15}$$

In particular, $h_\infty \in L^1([0,\infty))$, so its cosine transform $g_\infty(x) = \frac{1}{\pi}\int_0^\infty h_\infty(t)\cos(xt)\,dt$ is
well-defined and continuous. Moreover, by (15) we have

$$\frac{1}{\pi}\int_T^\infty h_\infty(t)\,dt \le \frac{1}{\pi}\int_T^\infty \frac{\log(1+t)}{\log(1+T)}h_\infty(t)\,dt \le \frac{C}{\log(1+T)},$$

and similarly $\frac{1}{\pi}\int_T^\infty h_n(t)\,dt \le \frac{C}{\log(1+T)}$ for all $T > 0$. Therefore,

$$\left|g_n(x) - \frac{1}{\pi}\int_0^T h_n(t)\cos(xt)\,dt\right| \le \frac{C}{\log(1+T)} \quad\text{and}\quad \left|g_\infty(x) - \frac{1}{\pi}\int_0^T h_\infty(t)\cos(xt)\,dt\right| \le \frac{C}{\log(1+T)}.$$

Now, let $\varepsilon > 0$ be given, and choose $T > 0$ large enough that $\frac{C}{\log(1+T)} \le \frac{\varepsilon}{3}$. Further, let
$N \in \mathbb{Z}_{>0}$ be such that $n \ge N$ implies that $|h_n(t) - h_\infty(t)| \le \frac{\pi}{3T}\varepsilon$ for all $t \in [0,T]$. Then the
above inequalities yield

$$|g_n(x) - g_\infty(x)| \le \frac{2C}{\log(1+T)} + \left|\frac{1}{\pi}\int_0^T \bigl(h_n(t) - h_\infty(t)\bigr)\cos(xt)\,dt\right| \le \varepsilon$$

for all $n \ge N$ and $x \ge 0$. Thus, $g_n(x)$ converges uniformly to $g_\infty(x)$. In particular, $g_\infty$ is
supported on $[0,X]$ and satisfies $g_\infty(0) = 1$, so it is an element of $C(X)$.

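Finally, let $\delta > 0$ be given. By Stirling's formula, there is a number $T_0 > 0$ such that
$\Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}(\frac12 + a + it) \ge 0$ whenever $|t| \ge T_0$. Moreover, it follows from (15) that the function
$\Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}(\frac12 + a + it)h_\infty(t)$ is absolutely integrable, so there exists $T \ge T_0$ such that

$$0 \le \frac{1}{\pi}\int_{\mathbb{R}\setminus[-T,T]} \Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}\Bigl(\frac12 + a + it\Bigr)h_\infty(t)\,dt \le \delta.$$

Therefore,

$$\begin{aligned}
l(g_\infty) + \delta &\ge 2\sum_{n=1}^\infty \frac{\Lambda(n)\chi_\Delta(n)}{\sqrt{n}}g_\infty(\log n) - \frac{1}{\pi}\int_{-T}^{T} \Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}\Bigl(\frac12 + a + it\Bigr)h_\infty(t)\,dt \\
&= \lim_{m\to\infty}\left(2\sum_{n=1}^\infty \frac{\Lambda(n)\chi_\Delta(n)}{\sqrt{n}}g_m(\log n) - \frac{1}{\pi}\int_{-T}^{T} \Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}\Bigl(\frac12 + a + it\Bigr)h_m(t)\,dt\right) \\
&\ge \lim_{m\to\infty}\left(2\sum_{n=1}^\infty \frac{\Lambda(n)\chi_\Delta(n)}{\sqrt{n}}g_m(\log n) - \frac{1}{\pi}\int_{\mathbb{R}} \Re\frac{\Gamma_\mathbb{R}'}{\Gamma_\mathbb{R}}\Bigl(\frac12 + a + it\Bigr)h_m(t)\,dt\right) \\
&= \sup_{g\in C(X)} l(g) \ge l(g_\infty).
\end{aligned}$$

Since $\delta$ is arbitrary, we have $l(g_\infty) = \sup_{g\in C(X)} l(g)$. □

A.2. Proof of Prop. 2.2. We begin with some lemmas.

Lemma A.1. For $\nu > 0$, define

$$f_\nu(x) = \begin{cases}(1-x^2)^\nu & \text{if } |x| < 1,\\ 0 & \text{otherwise,}\end{cases} \qquad g_{\nu,X}(x) = \frac{\Gamma(\frac32+2\nu)}{\sqrt{\pi}\,\Gamma(1+2\nu)}\int_{\mathbb{R}} f_\nu(y)f_\nu\Bigl(\frac{2x}{X} - y\Bigr)dy \quad\text{for } x \ge 0,$$

and $h_{\nu,X}(t) = 2\int_0^\infty g_{\nu,X}(x)\cos(tx)\,dx$. Then $g_{\nu,X}(0) = 1$, $g_{\nu,X}$ is supported on $[0,X]$, $h_{\nu,X}$ is
non-negative and satisfies

$$h_{\nu,X}(t) \ll_\varepsilon \nu^{-1/2}e^{-2\nu}X\Bigl(\frac{4\nu}{Xt}\Bigr)^{2\nu+2} \quad\text{uniformly for } \frac{Xt}{4} \ge \nu \ge \varepsilon,$$

for any fixed $\varepsilon > 0$.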

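Proof. Using the Poisson representation for the J-Bessel function, we derive

$$\int_{\mathbb{R}} f_\nu(x)e^{itx}\,dx = 2\Gamma(1+\nu)\Bigl(\frac{2}{|t|}\Bigr)^{\nu}j_\nu(|t|),$$

where $j_\nu(u) = \sqrt{\frac{\pi}{2u}}J_{\nu+1/2}(u)$ is the spherical Bessel function with parameter $\nu$. From this
and the limit $j_\nu(u)\bigl(\frac{2}{u}\bigr)^\nu \to \frac{\sqrt{\pi}}{2\Gamma(\frac32+\nu)}$ as $u \to 0^+$, we derive

$$\int_{\mathbb{R}} f_\nu(x)^2\,dx = \int_{\mathbb{R}} f_{2\nu}(x)\,dx = \frac{\sqrt{\pi}\,\Gamma(1+2\nu)}{\Gamma(\frac32+2\nu)},$$

so that $g_{\nu,X}(0) = 1$. Therefore, we have

$$h_{\nu,X}(t) = \frac{2X\,\Gamma(\frac32+2\nu)\Gamma(1+\nu)^2}{\sqrt{\pi}\,\Gamma(1+2\nu)}\left[\Bigl(\frac{X|t|}{4}\Bigr)^{-\nu}j_\nu\Bigl(\frac{X|t|}{2}\Bigr)\right]^2.$$

It follows from [11, Thm. 2] that the function $u\,j_\nu(u)$ is bounded in the region $\{(\nu,u) : u \ge
2\nu > 0\}$. This combined with Stirling's formula gives the estimate. □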
Lemma A.2. For $a > 0$, set

$$H_a(t) = \frac{a}{4}\left[2\left(\frac{\sin(at/2)}{at/2}\right)^2 + \left(\frac{\sin((at-\pi)/2)}{(at-\pi)/2}\right)^2 + \left(\frac{\sin((at+\pi)/2)}{(at+\pi)/2}\right)^2\right]$$

and $G_a(x) = \frac{1}{\pi}\int_0^\infty H_a(t)\cos(tx)\,dt$ for $x \ge 0$. Then $G_a$ is supported on $[0,a]$, $G_a(0) = 1$,
and $H_a(t) \ge \frac{2a}{\max((at)^2,\pi^2/3)}$ for $t \in \mathbb{R}$.

Proof. A straightforward calculation gives

$$G_a(x) = \max\Bigl(0, 1 - \frac{x}{a}\Bigr)\cos^2\Bigl(\frac{\pi x}{2a}\Bigr), \tag{16}$$

which yields the stated properties of $G_a$. As for $H_a$, by rescaling, it suffices to prove the
bound for $a = 1$. A calculation shows that

$$H_1(t) = \frac{2}{t^2} + \frac{2\pi^2(3t^2-\pi^2)\cos^2(t/2)}{t^2(t^2-\pi^2)^2},$$

so that $H_1(t) \ge 2t^{-2}$ for $|t| \ge \pi/\sqrt{3}$. On the other hand, graphing the function verifies that
$H_1(t) \ge 6/\pi^2$ for $|t| \le \pi/\sqrt{3}$. □
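
The inequality in Lemma A.2 can also be checked numerically rather than by graphing. The sketch below assumes the closed form of $H_a$ as the $\frac12, \frac14, \frac14$ combination of squared $\sin(u)/u$ kernels centred at $at$ and $at \pm \pi$, with the unnormalized convention for the kernel (our reading of the garbled display); it compares that closed form with a direct Simpson-rule evaluation of $2\int_0^a G_a(x)\cos(tx)\,dx$ and with the claimed lower bound. All function names are ours.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def H(a, t):
    """Closed form assumed for H_a(t)."""
    u = a * t
    return (a / 4.0) * (
        2.0 * sinc(u / 2.0) ** 2
        + sinc((u - math.pi) / 2.0) ** 2
        + sinc((u + math.pi) / 2.0) ** 2
    )


def G(a, x):
    """G_a(x) = max(0, 1 - x/a) cos^2(pi x / (2a)), as in (16)."""
    return max(0.0, 1.0 - x / a) * math.cos(math.pi * x / (2.0 * a)) ** 2


def H_by_quadrature(a, t, n=2000):
    """2 * integral_0^a G_a(x) cos(t x) dx by the composite Simpson rule (n even)."""
    step = a / n
    total = G(a, 0.0) + G(a, a) * math.cos(t * a)
    for i in range(1, n):
        x = i * step
        total += (4.0 if i % 2 else 2.0) * G(a, x) * math.cos(t * x)
    return 2.0 * total * step / 3.0


def lemma_bound(a, t):
    """Right-hand side 2a / max((a t)^2, pi^2 / 3) of the lemma."""
    return 2.0 * a / max((a * t) ** 2, math.pi ** 2 / 3.0)


if __name__ == "__main__":
    for t in (0.0, 1.0, math.pi, 5.0):
        print(t, H(1.0, t), H_by_quadrature(1.0, t), lemma_bound(1.0, t))
```

Equality in the bound occurs where $\cos(at/2) = 0$ with $|at| \ge \pi/\sqrt3$ and $at \ne \pm\pi$ (for example $at = 3\pi$), and also at the crossover $|at| = \pi/\sqrt3$, so a numerical check needs a small tolerance there.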



Now, turning to Prop. 2.2, first note th at \qA \ > 3. We take h(t) = h u ,x( t), w here 



5X^1 



> \ log log 3 > 0. Applying Lemma A.l with e = \ log log 3 and Lemma 



A.2 



a = 2 log log |gA|, for t > 8 we have 

hit) « u-^e^X (^) 2U+2 < X(5X)- l / 2 e- 5x / 2 a- l m^ ((a5) 2 , ^ H a (t) 
e- A X max((251oglog|gA|) 2 ,vr 2 /3) 



with 



- (2 log log | qA |) 3 / 2 log|gA| ~ Ha ^ ) ' 

Since jj(qA) > 5 for every j, this yields 

e~ A X max((251oglog|gA|) 2 ,7r 2 /3) 



5>( 7i (gA))< 



i=i 



(loglog|gA|) 3 / 2 



log|qA| 



£fla(7i((?A)). 



i=i 



We estimate the latter sum by plugging back into the explicit formula (|3| , with A replaced 
by qA. A calculation with the prime number theorem using (16) shows that 



£^G„(log„)« 



n=l 

and it is not hard to see that 

r i-G a (x) 

l 2 sinh(a;/2) 



a/2 



ft' 



log(8vre 7 ) 



dx + x,a(-1) I" ° a }^ /n , dx -- ()( I i 
Jo 



2cosh(x/2) 



uniformly for a > 21oglog3. Thus, Y^jLi ^a(7j(?A)) <C log \qA\. 

Finally, by [IHl Thm. 11], under GRH we have 5 <C 1/ log log |<?A|. This yields ([7]). D 
A.3. Proof of Prop. 2.3. We may assume without loss of generality that $N$ is odd, not
a perfect square, and satisfies $N > \exp(\exp(C^{2/3}))$, where $C > 0$ is the implied constant in
(7). Thus, if we set $d = (-1)^{(N-1)/2}N$ then $d = \Delta\ell^2$ for some $1 \ne \Delta \in \mathcal{F}$ and $\ell \in \mathbb{Z}_{>0}$.

Let Q > 3 be an integer parameter to be specified later. Set v = | log log N + 12 i Q j~ jv > 
$X = 4\nu\log Q$, and let $T$ be an integer in the interval $[e^X - 1, e^X + 1)$. (Note that to find such
a $T$, it suffices to compute $e^X$ to within $\pm\frac12$.) We let $q$ run through all elements of $\mathcal{F}$ with
$|q| \le Q$ and evaluate the lower bound (6) using the test function $g = g_{\nu,X}$, in the notation
of Lemma A.1.

Since $g_{\nu,X}(\log n) = 0$ for $n > T$, it is enough to consider the terms of the sum for $n \le T$.
As described in §1.2, since $\Delta$ is unknown to us, we compute $\chi_d(n)$ in place of $\chi_\Delta(n)$. If for
any prime value of $n$ we find a zero value of $\chi_d(n)$, we check to see if $n^2 \mid N$ and exit with this
square factor if so; otherwise $\chi_\Delta(n) = \chi_d(n)$. In particular, while computing (6) for $q = 1$,
we evaluate $\chi_\Delta(n)$ for all primes $n \le T$. If $(T+1)^3 > N$ then this alone yields enough
information to determine whether $N$ is squarefree. Hence, we may assume without loss of
generality that $X \le \frac13\log N$, so that $\log Q \le \frac{\log N}{12\nu}$.
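
The claim that the case $(T+1)^3 > N$ is settled by the small primes alone can be sketched as follows. Once every prime up to $T$ has been divided out once, the cofactor is below $(T+1)^3$ and has only prime factors above $T$, so it has at most two of them and is non-squarefree exactly when it is a perfect square. The function below is our illustration using plain trial division; in the algorithm of the text the small prime factors are found as a by-product of evaluating $\chi_d(n)$ instead.

```python
import math


def squarefree_via_cube_bound(N, T):
    """Decide whether N is squarefree by trial division up to T, assuming (T+1)^3 > N."""
    if (T + 1) ** 3 <= N:
        raise ValueError("this shortcut needs (T + 1)^3 > N")
    M = N
    for p in range(2, T + 1):
        if M % p == 0:
            M //= p
            if M % p == 0:
                return False  # p^2 divides N
    # Every prime factor of M now exceeds T and M < (T+1)^3, so M has at most
    # two prime factors; it fails to be squarefree only if it is a square > 1.
    r = math.isqrt(M)
    return M == 1 or r * r != M


if __name__ == "__main__":
    print(squarefree_via_cube_bound(3 * 5 * 7 * 11 * 13, 24))
    print(squarefree_via_cube_bound(3 * 101 * 101, 31))
```

Composite values of `p` never divide `M` in the loop, because each of their prime factors has already been removed once and a second occurrence would have triggered the early return.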

and 5 = ^-^ then g Ut x is precisely the test 
Using the bound gA| < QN < N 1+ r^, we 



3 —to - ' J "" X"""- —to ^5 _ 12l y 

gA 



2.2 



Note that if we set A = 2u — log log 

function exhibited in the proof of Prop. 

derive the inequality 1 — e~ A > ^(loglog iV) -2 > 0. Thus, Prop. 2.2 shows that if |A| = iV 

(which holds when iV is squarefree) and 71 (gA) > ^-^ then 

^^(7i(?A)) < ^1 - 72 ( loglogiV )2j (i gi g( g AT))3/2 ~ y 1 ' 72(loglogAT) 2 J X ' 

Therefore, if we evaluate (J6J) to within ± 72(lo x o N)2 , we will have proven that | A| > Ne~ 2X , 
so that £ < T. Having already determined all prime factors of iV up to T, we will thus have 
found a proof that iV is squarefree. 

Now, since the value of $\theta^*$ is unknown to us, we cannot say in advance what value of $Q$ will
suffice. In our algorithm, we therefore apply the above procedure iteratively with $Q = 2^k$
for $k = 2,3,4,\ldots$ until we find either a square factor or a proof that $N$ is squarefree. As
noted above, the algorithm must eventually terminate. If it turns out that $\theta^* = 1$ or if $N$
has a square factor then the algorithm becomes a rather inefficient version of trial division,
which nevertheless runs in polynomial time in $N$; in particular, the $O(\exp[(\log N)^{1+o(1)}])$
running time estimate holds. Henceforth we will assume that $\theta^* < 1$ and that the input $N$
is squarefree.

Fix £ G (0, 1 - 6*). From the definition of 9* it follows that ^(9* + e) < 9*. Thus, there 
exists N (e) G Z >0 such that tja(9* + e) < 9* +e whenever |A| = iV > N (e). Let us assume 
that iV > N (e). Then once {gjgg > 9* + e, there must be a q with (q, A) = 1 and \q\ < Q 
such that 71 (q A) > y^. 

It is straightforward to see that all of the floating point operations required to compute (6)
for every $|q| \le Q$ to the precision described above may be carried out in time $O(Q^{1+4\nu}\log^c N)$
for some $c > 0$. Since we choose values of $Q$ from a geometric progression, the total running
If $C$ is at all large then this rules out every $N$ of practical size; we could deal with this instead by
increasing $A$ in Prop. 2.2 by a constant, but as we are only interested in the theoretical result, we make this
assumption for convenience.
time is dominated by that of the final iteration. In the worst case, it might be that the
smallest Q for which ^"^ > 9* + e is 2 k + 1 for some k, and thus our final choice of 
Q = 2 k+1 would be too large by roughly a factor of 2. Thus, logQ < (\ogN) e * +£ + log 2, so 
that Q 1+4l/ log c A < exp[(logiV) e * +£ (l + 4z/)] (log A^) c+lo § 4 . Since 1 + 4v < log log iV and e 
may be chosen arbitrarily small (assuming only that N > Nq(e)), the running time is thus 
O (exp [(log 7V) r+o(1) ] ) , as required. □ 



A.4. Proof of Prop. 3.3.

Lemma A.3. For each $N \ge 1$, and each $s \in (0,\pi)$, we have
(17) 

W USp (2N){0i>s) 1 / sin( S /2)\ 27V+1 1 / sm(s/2)\ 2N+ ' f Ns\ 



1 * cos (s /2)-(- +1 ) ^H 1 + ^^J + 2l 1 -^^J - eXP l7lJ- 

In particular, we have

$$\log\mathbb{P}_{USp(2N)}(\theta_1 > s) = [2 + O((Ns)^{-1})]N^2\log\cos(s/2), \tag{18}$$

uniformly for $s \in (0,\pi)$.
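Proof. The Weyl integration formula on USp(2N) gives

$$\mathbb{P}_{USp(2N)}(\theta_1 > s) = \int_{USp(2N)} 1_{\theta_1 > s}\, d\mathbb{P}_{USp(2N)} = \frac{2^{N^2}}{N!\,\pi^N} \int_{(s,\pi]^N} \prod_{1\le j<k\le N} (\cos\phi_k - \cos\phi_j)^2 \prod_{1\le j\le N} (\sin\phi_j)^2\, d\phi_1 \cdots d\phi_N.$$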



We proceed along the lines of the proof of [10, proposition 6.10.1]. Applying the change
of variable $\phi_j = 2\tau_j$, the trig identities $\sin(2\tau_j) = 2\sin\tau_j\cos\tau_j$ and $\cos(2\tau_j) = 2\cos^2\tau_j - 1$,
and, last, the substitution $w_j = \cos^2\tau_j$, we obtain $\mathbb{P}_{USp(2N)}(\theta_1 > s) = I_N(\cos^2(s/2))$, where

$$I_N(\lambda) := \frac{C(N)}{N!} \int_{[0,\lambda)^N} \prod_{1\le j<k\le N} (w_k - w_j)^2 \prod_{1\le j\le N} \sqrt{w_j(1-w_j)}\; dw_1 \cdots dw_N,$$

and $C(N) := 2^{2N^2+N}/\pi^N$. The change of variable $w_j = \lambda x_j$ thus yields

(19) $$I_N(\lambda) = \lambda^{N^2+N/2}\, \frac{C(N)}{N!} \int_{[0,1)^N} \prod_{1\le j<k\le N} (x_k - x_j)^2 \prod_{1\le j\le N} \sqrt{x_j(1-\lambda x_j)}\; dx_1 \cdots dx_N.$$

Since $\sqrt{1-\lambda x_j} \ge \sqrt{1-x_j}$ for $0 < \lambda \le 1$, we have $I_N(\lambda) \ge \lambda^{N^2+N/2} I_N(1)$. The lower bound
follows on observing that $I_N(1) = \mathbb{P}_{USp(2N)}(\theta_1 > 0) = 1$.

By comparing the joint probability density functions of eigenphases in USp(2N) and
SO(2N), one easily obtains the rough upper bound $\mathbb{P}_{USp(2N)}(\theta_1 > s) \le 4^N\, \mathbb{P}_{SO(2N)}(\theta_1 > s)$.
Combined with the estimate $\mathbb{P}_{SO(2N)}(\theta_1 > s) \le \cos(s/2)^{2N^2-N}$ from [10, proposition 6.10.1]
and the lower bound $I_N(\lambda) \ge \lambda^{N^2+N/2}$, this yields the asymptotic (18) in the range $\beta \in (1, 2)$.



18 $\mathbb{P}_{G(N)}(\theta_1 > s)$ is eigen(0, s, G(N)) in the notation of [10].
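To derive the upper bound, and consequently the asymptotic, in the full range, we apply the
first-order inequality

$$\sqrt{1-\lambda x_j} = \sqrt{1-x_j}\,\sqrt{1 + \frac{(1-\lambda)x_j}{1-x_j}} \le \sqrt{1-x_j}\,\Bigl(1 + \frac{(1-\lambda)x_j}{2(1-x_j)}\Bigr) = \frac{1+\lambda}{2}\,\sqrt{1-x_j}\,\Bigl(1 + \frac{1-\lambda}{1+\lambda}\cdot\frac{1}{1-x_j}\Bigr)$$

to get

$$\prod_{j=1}^N \sqrt{1-\lambda x_j} \le \Bigl(\frac{1+\lambda}{2}\Bigr)^N \prod_{j=1}^N \sqrt{1-x_j} \sum_{J\subset\{1,\ldots,N\}} \Bigl(\frac{1-\lambda}{1+\lambda}\Bigr)^{\#J} \prod_{j\in J} \frac{1}{1-x_j}$$
$$= \Bigl(\frac{1+\lambda}{2}\Bigr)^N \sum_{J\subset\{1,\ldots,N\}} \Bigl(\frac{1-\lambda}{1+\lambda}\Bigr)^{\#J} \prod_{j=1}^N \frac{1}{\sqrt{1-x_j}} \prod_{j\in\{1,\ldots,N\}\setminus J} (1-x_j),$$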

where $J$ runs through all subsets of $\{1, \ldots, N\}$. We insert this into (19) and permute the
variables so that $\{1, \ldots, N\} \setminus J$ is mapped to $\{1, \ldots, N - \#J\}$. Collecting the terms with a
common value of $\#J$, we have

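$$\frac{I_N(\lambda)}{\lambda^{N^2+N/2}} \le \frac{C(N)}{N!} \Bigl(\frac{1+\lambda}{2}\Bigr)^N \sum_{m=0}^N \binom{N}{m} \Bigl(\frac{1-\lambda}{1+\lambda}\Bigr)^m \int_{[0,1)^N} \prod_{1\le j<k\le N} (x_k - x_j)^2 \prod_{j=1}^N \sqrt{\frac{x_j}{1-x_j}} \prod_{j=1}^{N-m} (1-x_j)\; dx_1 \cdots dx_N.$$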

By Aomoto's formula [13, (17.1.6)], the integral may be written in the form

$$K_N \prod_{j=1}^{N-m} \frac{\frac12 + N - j}{1 + 2N - j} = K_N\, \frac{\Gamma(N+\frac12)}{\Gamma(m+\frac12)}\, \frac{\Gamma(1+N+m)}{\Gamma(1+2N)},$$

where $K_N$ (a Selberg integral, see [13, (17.1.3)]) is independent of $m$. When $m = 0$, the
integral is easily recognized as the one occurring in (19) with $\lambda = 1$, so that

$$1 = I_N(1) = \frac{C(N) K_N}{N!}\, \frac{\Gamma(N+\frac12)}{\Gamma(\frac12)}\, \frac{\Gamma(1+N)}{\Gamma(1+2N)}.$$

Solving for $C(N)K_N/N!$ and substituting back into the above, we have

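$$\frac{I_N(\lambda)}{\lambda^{N^2+N/2}} \le \Bigl(\frac{1+\lambda}{2}\Bigr)^N \sum_{m=0}^N \binom{N}{m} \Bigl(\frac{1-\lambda}{1+\lambda}\Bigr)^m \frac{\Gamma(\frac12)}{\Gamma(m+\frac12)}\, \frac{\Gamma(1+N+m)}{\Gamma(1+N)} = \Bigl(\frac{1+\lambda}{2}\Bigr)^N \sum_{m=0}^N \binom{N+m}{2m} \Bigl(4\,\frac{1-\lambda}{1+\lambda}\Bigr)^m.$$

The last sum is known as a Morgan-Voyce polynomial; it is closely related to the Chebyshev
polynomials, and may be evaluated in closed form. Precisely, if $t$ is such that $\cosh t = \sqrt{\frac{2}{1+\lambda}}$,
then it follows from [23] that the last line is

$$\Bigl(\frac{1+\lambda}{2}\Bigr)^N \frac{\cosh((2N+1)t)}{\cosh t} = \frac12\Bigl(1 + \sqrt{\frac{1-\lambda}{2}}\Bigr)^{2N+1} + \frac12\Bigl(1 - \sqrt{\frac{1-\lambda}{2}}\Bigr)^{2N+1}.$$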
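The chain of identities in this step (the Gamma-function sum, the Morgan-Voyce sum, and the closed form) can be sanity-checked numerically. The sketch below is only an illustration: the three expressions are written as read from the derivation above, including the factor $4(1-\lambda)/(1+\lambda)$ in the binomial sum and the prefactor $((1+\lambda)/2)^N$, and the function names are hypothetical.

```python
import math


def gamma_form(N, lam):
    """((1+lam)/2)^N * sum_m C(N,m) r^m Gamma(1/2)Gamma(1+N+m) / (Gamma(m+1/2)Gamma(1+N))."""
    r = (1 - lam) / (1 + lam)
    total = sum(
        math.comb(N, m) * r ** m
        * math.gamma(0.5) * math.gamma(1 + N + m)
        / (math.gamma(m + 0.5) * math.gamma(1 + N))
        for m in range(N + 1)
    )
    return ((1 + lam) / 2) ** N * total


def morgan_voyce_form(N, lam):
    """((1+lam)/2)^N * sum_m C(N+m, 2m) x^m with x = 4(1-lam)/(1+lam)."""
    x = 4 * (1 - lam) / (1 + lam)
    total = sum(math.comb(N + m, 2 * m) * x ** m for m in range(N + 1))
    return ((1 + lam) / 2) ** N * total


def closed_form(N, lam):
    """(1/2)(1+a)^(2N+1) + (1/2)(1-a)^(2N+1) with a = sqrt((1-lam)/2)."""
    a = math.sqrt((1 - lam) / 2)
    return 0.5 * (1 + a) ** (2 * N + 1) + 0.5 * (1 - a) ** (2 * N + 1)


if __name__ == "__main__":
    for N in (1, 3, 6):
        for lam in (0.2, 0.7):
            print(N, lam, gamma_form(N, lam), morgan_voyce_form(N, lam), closed_form(N, lam))
```

For $N = 1$ all three expressions reduce to $(5 - 3\lambda)/2$, which gives a quick hand check of the normalization.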

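The upper bound follows on putting $\lambda = \cos^2(s/2)$ and noting that

$$\frac12\Bigl(1 + \frac{\sin(s/2)}{\sqrt2}\Bigr)^{2N+1} + \frac12\Bigl(1 - \frac{\sin(s/2)}{\sqrt2}\Bigr)^{2N+1} \le \exp\Bigl(\frac{Ns}{\sqrt2}\Bigr).$$

Combining this with the lower bound and the inequality $|\log\cos(s/2)| \ge s^2/8$, we get the
estimate

$$\log \mathbb{P}_{USp(2N)}(\theta_1 > s) = N(2N+1)\log\cos(s/2) + O(Ns) = [2 + O((Ns)^{-1})]\,N^2 \log\cos(s/2).$$

□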
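As a small consistency check of the two-sided bound in Lemma A.3, the case $N = 1$ can be evaluated in closed form: the Weyl density on USp(2) is $(2/\pi)\sin^2\phi$ on $[0,\pi]$, so $\mathbb{P}_{USp(2)}(\theta_1 > s) = 1 - s/\pi + \sin(2s)/(2\pi)$. The sketch below assumes the reading of (17) with $\sin(s/2)/\sqrt2$ inside the parentheses and $\exp(Ns/\sqrt2)$ as the final bound; the function names are hypothetical.

```python
import math


def gap_prob_usp2(s):
    """P_{USp(2)}(theta_1 > s) = (2/pi) * integral_s^pi sin(phi)^2 dphi, in closed form."""
    return 1 - s / math.pi + math.sin(2 * s) / (2 * math.pi)


def normalized_gap_prob(s):
    """Middle quantity of (17) for N = 1: the gap probability divided by cos(s/2)^3."""
    return gap_prob_usp2(s) / math.cos(s / 2) ** 3


def upper_bounds(N, s):
    """The two upper bounds of (17): the symmetric power sum and exp(N s / sqrt(2))."""
    a = math.sin(s / 2) / math.sqrt(2)
    power_sum = 0.5 * (1 + a) ** (2 * N + 1) + 0.5 * (1 - a) ** (2 * N + 1)
    return power_sum, math.exp(N * s / math.sqrt(2))


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0, 3.0):
        print(s, normalized_gap_prob(s), upper_bounds(1, s))
```

Near $s = 0$ the normalized probability and the power sum agree to second order in $s$, so the inequality between them is tight there.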



Turning to the proof of Prop. 3.3, by definition of $\mathbb{P}_{USp(2N)}(\max_{1\le m\le M} \theta_1(m) \le s)$, we
have, for each $s \in [0, \pi]$,

(20) $$\mathbb{P}_{USp(2N)}\Bigl(\max_{1\le m\le M} \theta_1(m) \le s\Bigr) = \mathbb{P}_{USp(2N)}(\theta_1 \le s)^M.$$



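Suppose $\varepsilon < 4$ and $s \in [s^-_{\varepsilon,\beta}(N), s^+_{\varepsilon,\beta}(N)]$. Then $s \to 0$ as $N \to \infty$ (since $\beta < 2$ by
assumption). Further, by Lemma A.3 we have $\mathbb{P}_{USp(2N)}(\theta_1 > s) \to 0$ as $N \to \infty$ (since $\beta > 0$
by assumption). Using (20), and Lemma A.3 again, yields

(21) $$\begin{aligned} \log \mathbb{P}_{USp(2N)}\Bigl(\max_{1\le m\le M} \theta_1(m) \le s\Bigr) &= M \log(1 - \mathbb{P}_{USp(2N)}(\theta_1 > s)) \\ &= -M\, \mathbb{P}_{USp(2N)}(\theta_1 > s)(1 + o(1)) \\ &= -\exp\bigl((2N)^\beta + [2 + O((Ns)^{-1})]\,N^2 \log\cos(s/2) + o(1)\bigr) \\ &= -\exp\bigl((2N)^\beta - (1 + o(1))(Ns/2)^2 + o(1)\bigr), \end{aligned}$$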

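where we used that $\log\cos(s/2) \sim -s^2/8$ in the last line. So if $s = s^-_{\varepsilon,\beta}(N) = (4-\varepsilon)(2N)^{\beta/2-1}$,
then

(22) $$\begin{aligned} \log \mathbb{P}_{USp(2N)}\Bigl(\max_{1\le m\le M} \theta_1(m) \le s\Bigr) &= -\exp\bigl((2N)^\beta - (1 + o(1))(1 - \varepsilon/4)^2 (2N)^\beta + o(1)\bigr) \\ &= -\exp\bigl((1 + o(1))(2N)^\beta (8\varepsilon - \varepsilon^2)/16 + o(1)\bigr) \to -\infty, \end{aligned}$$

as $N \to \infty$, provided that $0 < \varepsilon < 4$, which ensures that $(2N)^\beta (8\varepsilon - \varepsilon^2)/16 \to \infty$. If $\varepsilon \ge 4$
then clearly the result still holds. Therefore, for each $\varepsilon > 0$, we have $\mathbb{P}_{USp(2N)}(s^-_{\varepsilon,\beta}(N) < \max_{1\le m\le M} \theta_1(m)) \to 1$, as claimed.

Similarly, if $s = s^+_{\varepsilon,\beta}(N) = (4+\varepsilon)(2N)^{\beta/2-1}$, then

$$\begin{aligned} \log \mathbb{P}_{USp(2N)}\Bigl(\max_{1\le m\le M} \theta_1(m) \le s\Bigr) &= -\exp\bigl((2N)^\beta - (1 + o(1))(1 + \varepsilon/4)^2 (2N)^\beta + o(1)\bigr) \\ &= -\exp\bigl(-(1 + o(1))(2N)^\beta (8\varepsilon + \varepsilon^2)/16 + o(1)\bigr), \end{aligned}$$

which tends to 0 with $N$. Therefore, $\mathbb{P}_{USp(2N)}(\max_{1\le m\le M} \theta_1(m) \le s^+_{\varepsilon,\beta}(N)) \to 1$. □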
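The only arithmetic in the last two displays is the coefficient of $(2N)^\beta$ left after substituting $s = (4 \pm \varepsilon)(2N)^{\beta/2-1}$ into $(2N)^\beta - (Ns/2)^2$. A minimal sketch, using exact rational arithmetic and ignoring the $o(1)$ terms, confirms the two coefficients and their signs; the function name is hypothetical.

```python
from fractions import Fraction


def exponent_coefficient(eps, sign):
    """Coefficient c with (2N)^beta - (N s / 2)^2 = c * (2N)^beta
    for s = (4 + sign * eps) * (2N)^(beta/2 - 1), where sign is +1 or -1."""
    eps = Fraction(eps)
    return 1 - (1 + sign * eps / 4) ** 2


if __name__ == "__main__":
    for eps in (Fraction(1, 10), Fraction(1), Fraction(3)):
        print(eps, exponent_coefficient(eps, -1), exponent_coefficient(eps, +1))
```

A positive coefficient (lower threshold) makes the exponent grow, so the log-probability tends to $-\infty$; a negative coefficient (upper threshold) makes it tend to 0.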
A.5. Proof of Prop. 3.4. We make use of the main results in [6] and [12], formulated in
Lemma A.4 here.

Lemma A.4. For $\delta > 0$ fixed, there exists a (large) positive constant $s_0$ such that

$$\log \mathbb{P}_{U(N)}(\theta_1 > 2s) = N^2 \log\cos(s/2) - \frac14 \log(N\sin(s/2)) + c_0 + O\bigl(1/(N\sin(s/2))\bigr),$$

$$\frac{d}{ds} \log \mathbb{P}_{U(N)}(\theta_1 > 2s) = -\frac{N^2}{2}\tan(s/2) - \frac18 \cot(s/2) + O\bigl(1/(N\sin^2(s/2))\bigr),$$

for all $N > s_0$ and $2s_0/N \le s \le \pi - \delta$, where $c_0$ is an explicit constant.

Proof. Clearly, $\mathbb{P}_{U(N)}(\theta_1 > 2s) = \int_{U(N)} 1_{\theta_1 > 2s}\, d\mathbb{P}_{U(N)}$. By the rotational invariance of $\mathbb{P}_{U(N)}$,
this is the same as $\int_{U(N)} 1_{\theta_1,\ldots,\theta_N \notin [0,s]\cup[2\pi-s,2\pi]}\, d\mathbb{P}_{U(N)}$. Further, by [H Lemma 2] and the Weyl
integration formula on U(N), this is the Toeplitz determinant $\det_{N\times N}\bigl(\int_s^{2\pi-s} e^{i(j-k)\theta}\, d\theta/(2\pi)\bigr)$,
for which the relevant asymptotics are supplied by formulas (8) and (12) in [6]. □
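For small $N$ the Toeplitz determinant in this proof can be evaluated directly, which gives a way to experiment with the gap probability $\mathbb{P}_{U(N)}(\theta_1 > 2s)$. The sketch below is an illustration only: it uses the closed-form entries $1 - s/\pi$ on the diagonal and $-\sin(ns)/(\pi n)$ for $j - k = n \ne 0$ (obtained by integrating $e^{in\theta}/(2\pi)$ over $[s, 2\pi - s]$), and a plain Gaussian elimination that is adequate only for modest $N$, since the matrix becomes badly conditioned as $N$ grows. All function names are hypothetical.

```python
import math


def toeplitz_gap_matrix(N, s):
    """N x N matrix with (j, k) entry equal to the integral of e^{i(j-k)theta}/(2 pi) over [s, 2 pi - s]."""
    def entry(n):
        if n == 0:
            return 1 - s / math.pi
        return -math.sin(n * s) / (math.pi * n)
    return [[entry(j - k) for k in range(N)] for j in range(N)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def gap_probability(N, s):
    """Numerical value of the Toeplitz determinant, i.e. P_{U(N)}(theta_1 > 2s)."""
    return determinant(toeplitz_gap_matrix(N, s))


if __name__ == "__main__":
    for N in (2, 4, 8):
        p = gap_probability(N, 1.0)
        print(N, p, math.log(p), N * N * math.log(math.cos(0.5)))
```

The printed columns allow an informal comparison of $\log \mathbb{P}_{U(N)}(\theta_1 > 2s)$ with the leading term $N^2 \log\cos(s/2)$ of Lemma A.4; no claim is made here about how quickly the lower-order terms settle.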



Remark. Lemma 6.8.3 in [10] furnishes the following interesting factorization of the gap
probabilities: $\mathbb{P}_{U(2N+1)}(\theta_1 > 2s) = \mathbb{P}_{SO(2N+2)}(\theta_1 > s)\, \mathbb{P}_{USp(2N)}(\theta_1 > s)$. So, as a direct
consequence of Lemmas A.3 and A.4 we obtain $\log \mathbb{P}_{SO(2N)}(\theta_1 > s) = (2 + o(1))N^2 \log\cos(s/2)$,
provided that $Ns \to \infty$ as $N \to \infty$. Note that the machinery of orthogonal polynomials
supplies general, but involved, methods to derive precise asymptotics for determinant
expressions of gap probabilities like those in (23); e.g. see [3] and [7]. In the case of USp(2N),
for example, the gap probability can be expressed as a Toeplitz+Hankel determinant, or by
appealing to (19), as a Hankel determinant.



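To prove the proposition, first note

(23) $$\frac{N}{2\pi}\, \mathbb{P}_{U(N)}(\theta_2 - \theta_1 > u) = -\frac{d}{du}\, \mathbb{P}_{U(N)}(\theta_1 > u).$$

Therefore,

(24) $$\mathbb{P}_{U(N)}(\theta_2 - \theta_1 > u) = -\frac{2\pi}{N}\cdot\frac12\, \frac{d}{ds}\, \mathbb{P}_{U(N)}(\theta_1 > 2s)\Big|_{s=u/2} = -\frac{\pi}{N} \Bigl[\frac{d}{ds} \log \mathbb{P}_{U(N)}(\theta_1 > 2s)\Bigr]_{s=u/2} \mathbb{P}_{U(N)}(\theta_1 > u).$$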



It follows from Lemma A.4 that as $N \to \infty$, and uniformly for $N^{\nu_1 - 1} \le u \le 2\pi - \delta$
(for any small constant $\nu_1 > 0$ we wish), the term controlling the behavior in (24) is
$\mathbb{P}_{U(N)}(\theta_1 > u)$. Explicitly,

$$\mathbb{P}_{U(N)}(\theta_2 - \theta_1 > u) = \exp\bigl((1 + o(1))N^2 \log\cos(u/4)\bigr).$$

If we further require $u \le N^{-\nu_2}$ (for any small constant $\nu_2 > 0$ we wish), then $u \to 0$ as
$N \to \infty$ and $\log\cos(u/4) \sim -u^2/32$. Therefore,

(25) $$\mathbb{P}_{U(N)}\bigl(\tfrac12[\theta_2 - \theta_1] > u\bigr) = \exp\bigl(-(1 + o(1))N^2u^2/8\bigr)$$

uniformly for $N^{\nu_1 - 1} \le 2u \le N^{-\nu_2}$. Thus, by a similar calculation to (21), we obtain the
result for $\max_{1\le m\le M_\beta(N)} \tfrac12[\theta_2(m) - \theta_1(m)]$. If, in addition, one maximizes over all the spacings
in each matrix, thus replacing $\tfrac12[\theta_2(m) - \theta_1(m)]$ by $\max_{1\le j\le N} \tfrac12[\theta_{j+1}(m) - \theta_j(m)]$, then the
same result holds. For clearly

$$\mathbb{P}_{U(N)}\Bigl(u^-_{\varepsilon,\beta}(N) < \max_{1\le m\le M_\beta(N)} \tfrac12[\theta_2(m) - \theta_1(m)]\Bigr) \le \mathbb{P}_{U(N)}\Bigl(u^-_{\varepsilon,\beta}(N) < \max_{1\le m\le M_\beta(N)} \max_{1\le j\le N} \tfrac12[\theta_{j+1}(m) - \theta_j(m)]\Bigr).$$

Footnote: See Lemma 3.1 in [2], but note it is missing a factor of $2\pi$.

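Hence, if the left-hand side tends to 1 then the right-hand side does as well. In the opposite
direction, we have

$$N\, \mathbb{P}_{U(N)}\bigl(\tfrac12[\theta_2 - \theta_1] > u^+_{\varepsilon,\beta}(N)\bigr) = o(1),$$

as $N \to \infty$, by virtue of (25). Thus

$$\begin{aligned} \log \mathbb{P}_{U(N)}\Bigl(\max_{1\le m\le M_\beta(N)} \max_{1\le j\le N} \tfrac12[\theta_{j+1}(m) - \theta_j(m)] \le u^+_{\varepsilon,\beta}(N)\Bigr) &= M \log \mathbb{P}_{U(N)}\Bigl(\max_{1\le j\le N} \tfrac12[\theta_{j+1} - \theta_j] \le u^+_{\varepsilon,\beta}(N)\Bigr) \\ &\ge M \log\bigl(1 - N\, \mathbb{P}_{U(N)}(\tfrac12[\theta_2 - \theta_1] > u^+_{\varepsilon,\beta}(N))\bigr) \\ &= -MN\, \mathbb{P}_{U(N)}\bigl(\tfrac12[\theta_2 - \theta_1] > u^+_{\varepsilon,\beta}(N)\bigr)(1 + o(1)), \end{aligned}$$

where we used the rotational invariance of $\mathbb{P}_{U(N)}$ in the third line. Hence maximizing over
$1 \le j \le N$ does not change our previous calculation more than does increasing the number
of samples by a factor of $N$, which affects lower order terms only. Thus, as before,
$-MN\, \mathbb{P}_{U(N)}(\tfrac12[\theta_2 - \theta_1] > u^+_{\varepsilon,\beta}(N)) \to 0$, and so
$\mathbb{P}_{U(N)}(\max_{1\le m\le M_\beta(N)} \max_{1\le j\le N} \tfrac12[\theta_{j+1}(m) - \theta_j(m)] \le u^+_{\varepsilon,\beta}(N)) \to 1$. □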



A.6. Proof of Prop. 4.2. By the growth estimate for the divisor $m$, there is a Hadamard
product $F(z)$ which is entire of finite order, even, satisfies $\operatorname{ord}_{z=\gamma} F(z) = m(\gamma)$ for all $\gamma \in S$
and does not vanish outside of $S$. Moreover, $\frac{F'}{F}(z)$ has at most polynomial growth in
horizontal strips outside of $S$, so that $\frac{F'}{F}(z)h(z)$ decays rapidly in such strips for any $h$ as in
the statement of the proposition. By the argument principle, for any $c > 1/2$ we have

$$\sum_{\gamma\in S} m(\gamma)h(\gamma) = \frac{1}{2\pi i} \int_{\Im(z)=-c} \frac{F'}{F}(z)h(z)\, dz - \frac{1}{2\pi i} \int_{\Im(z)=c} \frac{F'}{F}(z)h(z)\, dz = \frac{1}{\pi i} \int_{\Im(z)=-c} \frac{F'}{F}(z)h(z)\, dz.$$

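Next, let $a \in \{0, 1\}$ be such that $(-1)^a = \operatorname{sgn} d$, and define

$$\Lambda(s) = |d|^{s/2}\, \Gamma_{\mathbb{R}}(s + a) \exp\Bigl(\sum_{n=2}^\infty \frac{c_n}{\log n}\, n^{\frac12 - s}\Bigr) \quad\text{and}\quad \Phi(z) = \Lambda(\tfrac12 + iz).$$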

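By the estimate for $c_n$, $\Phi$ is analytic for $\Im(z) < -1$, where it satisfies

$$\frac{\Phi'}{\Phi}(z) = i\Bigl[\frac12 \log|d| + \frac{\Gamma_{\mathbb{R}}'}{\Gamma_{\mathbb{R}}}\bigl(\tfrac12 + a + iz\bigr) - \sum_{n=2}^\infty c_n n^{-iz}\Bigr].$$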
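Thus, for any $c > 1$ we have

$$\begin{aligned} \frac{1}{\pi i} \int_{\Im(z)=-c} \frac{\Phi'}{\Phi}(z)h(z)\, dz &= g(0)\log|d| + \frac{1}{\pi} \int_{\mathbb{R}} \Re\, \frac{\Gamma_{\mathbb{R}}'}{\Gamma_{\mathbb{R}}}\bigl(\tfrac12 + a + it\bigr) h(t)\, dt - 2\sum_{n=2}^\infty c_n\, g(\log n) \\ &= \sum_{\gamma\in S} m(\gamma)h(\gamma) = \frac{1}{\pi i} \int_{\Im(z)=-c} \frac{F'}{F}(z)h(z)\, dz. \end{aligned}$$

Let us now set $f(z) = \frac{F'}{F}(z) - \frac{\Phi'}{\Phi}(z)$ for $\Im(z) < -1$. By the above, we see that
$\int_{\Im(z)=-c} f(z)h(z)\, dz = 0$ for every $c > 1$ and every suitable choice of test function $h$.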
Fix one choice of $h$ and consider the Fourier transform

$$u(x) = \frac{1}{2\pi} \int_{\Im(z)=-c} f(z)h(z)e^{-ixz}\, dz.$$

Note that since $f(z)h(z)$ is holomorphic for $\Im(z) < -1$ and of rapid decay in horizontal
strips, $u(x)$ does not depend on $c$. Further, for any fixed $x \in \mathbb{R}$, $h(z)\cos(xz)$ is also a
suitable test function, so we have

$$u(x) + u(-x) = \frac{1}{\pi} \int_{\Im(z)=-c} f(z)h(z)\cos(xz)\, dz = 0,$$

i.e. $u$ is an odd function of $x$. Combining this with the trivial estimate $u(x) \ll_{c,h} e^{-cx}$, we
get $u(x) \ll_{c,h} e^{-c|x|}$.

Using this estimate for some $c > 1$ together with the Fourier inversion formula

$$f(z)h(z) = \int_{-\infty}^{\infty} u(x)e^{ixz}\, dx,$$

we see that $f(z)h(z)$ continues to an entire function and is odd. Since $h$ is arbitrary, it follows
from a suitable approximation argument that $f$ continues to an odd entire function with at
most polynomial growth in horizontal strips. Recalling the definition of $f$ and integrating,
we see that $\Phi(z)$ continues to an entire function of finite order satisfying $\Phi(z) = \epsilon\Phi(-z)$ for
some $\epsilon \in \{\pm1\}$, $\operatorname{ord}_{z=\gamma} \Phi(z) = m(\gamma)$ for every $\gamma \in S$, and $\Phi(z) \ne 0$ for $z \notin S$.

The remaining statements now essentially follow from the converse theorem for degree 1
elements of the Selberg class [9], except that the proof given there assumes that the Dirichlet
series $L(s) = \sum_{n=1}^\infty a_n n^{-s}$ defined by $L(s) = \exp\bigl(\sum_{n=2}^\infty \frac{c_n}{\log n}\, n^{\frac12 - s}\bigr)$ converges absolutely for
$\Re(s) > 1$, which we only know to be true for $\Re(s) > \frac32$. That assumption is not necessary,
however, and for the sake of completeness we sketch a simplified proof following the method
of [21].

First note that the symmetry of $\Phi$ is equivalent to the functional equation $\Lambda(s) = \epsilon\Lambda(1 - s)$.
Next, for any $\alpha, y > 0$ we have

$$\begin{aligned} 2\sum_{n=1}^\infty a_n e(n\alpha)e^{-2\pi ny} &= \frac{1}{2\pi i} \int_{\Re(s)=2} L(s)\Gamma_{\mathbb{C}}(s)(y - i\alpha)^{-s}\, ds \\ &= \frac{1}{2\pi i} \int_{\Re(s)=2} \Lambda(s)\Gamma_{\mathbb{R}}(s + 1 - a)\bigl[\sqrt{|d|}\,(y - i\alpha)\bigr]^{-s}\, ds, \end{aligned}$$

where for any $z$ with positive real part we define $z^{-s} = \exp(-s\log z)$ using the principal
branch of the logarithm.

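By the Phragmén-Lindelöf theorem, the integrand decays rapidly in vertical strips, so we
may shift the contour to $\Re(s) = -3/4$ and apply the functional equation to obtain

$$\begin{aligned} 2\sum_{n=1}^\infty a_n e(n\alpha)e^{-2\pi ny} - (1 - (-1)^a)\Lambda(0) &= \frac{1}{2\pi i} \int_{\Re(s)=-3/4} \Lambda(s)\Gamma_{\mathbb{R}}(s + 1 - a)\bigl[\sqrt{|d|}\,(y - i\alpha)\bigr]^{-s}\, ds \\ &= \frac{\epsilon}{2\pi i} \int_{\Re(s)=7/4} \Lambda(s)\Gamma_{\mathbb{R}}(2 - a - s)\bigl[\sqrt{|d|}\,(y - i\alpha)\bigr]^{s-1}\, ds. \end{aligned}$$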

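Expanding $\Lambda(s)$ as $|d|^{s/2}\Gamma_{\mathbb{R}}(s + a)\sum_{n=1}^\infty a_n n^{-s}$ and using the identity $\Gamma_{\mathbb{R}}(s)\Gamma_{\mathbb{R}}(2 - s) = \csc(\pi s/2)$, we get

$$\begin{aligned} 2\sum_{n=1}^\infty a_n e(n\alpha)e^{-2\pi ny} - (1 - (-1)^a)\Lambda(0) &= \epsilon\sqrt{|d|} \sum_{n=1}^\infty a_n\, \frac{1}{2\pi i} \int_{\Re(s)=7/4} n^{-s} \csc\Bigl(\frac{\pi(s+a)}{2}\Bigr) \bigl[|d|(y - i\alpha)\bigr]^{s-1}\, ds \\ &= \frac{2\epsilon i^{a+1}}{\pi}\, \sqrt{|d|} \sum_{n=1}^\infty \frac{a_n}{n}\, \frac{\bigl(\frac{n}{|d|(\alpha+iy)}\bigr)^{1-a}}{1 - \bigl(\frac{n}{|d|(\alpha+iy)}\bigr)^2}. \end{aligned}$$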
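The two Gamma-factor identities used in this computation, $\Gamma_{\mathbb{R}}(s)\Gamma_{\mathbb{R}}(2-s) = \csc(\pi s/2)$ and $\Gamma_{\mathbb{C}}(s) = \Gamma_{\mathbb{R}}(s)\Gamma_{\mathbb{R}}(s+1)$, can be spot-checked on the real axis. The sketch below assumes the standard normalizations $\Gamma_{\mathbb{R}}(s) = \pi^{-s/2}\Gamma(s/2)$ and $\Gamma_{\mathbb{C}}(s) = 2(2\pi)^{-s}\Gamma(s)$, which are not restated in this section; the function names are hypothetical.

```python
import math


def gamma_R(s):
    """Gamma_R(s) = pi^(-s/2) * Gamma(s/2), for real s > 0 (assumed normalization)."""
    return math.pi ** (-s / 2) * math.gamma(s / 2)


def gamma_C(s):
    """Gamma_C(s) = 2 * (2 pi)^(-s) * Gamma(s), for real s > 0 (assumed normalization)."""
    return 2 * (2 * math.pi) ** (-s) * math.gamma(s)


if __name__ == "__main__":
    for s in (0.25, 0.75, 1.0, 1.75):
        print(s, gamma_R(s) * gamma_R(2 - s), 1 / math.sin(math.pi * s / 2))
```

The first identity is the reflection formula for $\Gamma(s/2)$ and the second is the duplication formula, so the check is a guard against normalization slips rather than new information.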



If $\alpha|d|$ is not an integer then the last line is $O_\alpha(1)$ uniformly for $y \in (0, 1)$, while if $\alpha|d| = n$
is an integer then we get $\frac{\epsilon i^a a_n}{\pi\sqrt{|d|}\,y} + O_\alpha(1)$. Since the left-hand side is periodic in $\alpha$, we conclude
that $d$ is an integer and $a_n = a_{n+|d|}$, i.e. the coefficients $a_n$ are periodic. Moreover, since $\Lambda(s)$
does not vanish for $\Re(s) > 1$, it follows from [20, Thm. 4] that there is a positive integer
$q$ dividing $d$ and a primitive Dirichlet character $\chi$ (mod $q$) such that $L(s) = D(s)L(s, \chi)$,
where $D(s) = \sum_{n \mid \frac{|d|}{q}} b_n n^{-s}$ for certain coefficients $b_n$, with $b_1 = 1$.

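Let $\Lambda(s, \chi) = q^{s/2}\,\Gamma_{\mathbb{R}}(s + a')L(s, \chi)$ be the associated complete L-function. Then we have

(26) $$\frac{\Lambda(s)}{\Lambda(s, \chi)} = \Bigl(\frac{|d|}{q}\Bigr)^{s/2} \frac{\Gamma_{\mathbb{R}}(s + a)}{\Gamma_{\mathbb{R}}(s + a')}\, D(s).$$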

Moreover, it is easy to see that $D(s)\Lambda(s, \chi)$ does not vanish in some left half plane. Thus,



to avoid concluding from (26) that A(s) has poles at negative integers, it must be the case 
that and a' = a, so that %(— 1) = sgnd. From this and the functional equations for A(s) 
and A(s,%), it follows that A ^'-| dii-s) 1S an en ti re function. Note that for large T > 0, 
$D(s)/D(1 - s)$ has $O(T)$ zeros and poles with imaginary part in $[-T, T]$. On the other hand,
work of Fujii [5] shows that if $\chi_1$ and $\chi_2$ are distinct, primitive Dirichlet characters then
$L(s, \chi_1)/L(s, \chi_2)$ has $\gg T\log T$ zeros and poles in that region. Thus, we must have $\chi = \bar\chi$,
i.e. $\chi$ is quadratic and $q\operatorname{sgn} d$ is a fundamental discriminant.
Therefore, by (26) and the functional equations for $\Lambda(s)$ and $\Lambda(s, \chi)$, $D(s)$ satisfies the
functional equation

(27) $$D(s) = \epsilon\Bigl(\frac{|d|}{q}\Bigr)^{\frac12 - s} D(1 - s).$$

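Next, from the formula for $L(s)$, we have

(28) $$\frac{D'}{D}(s) = \sum_{n \mid (|d|/q)^\infty} \Bigl(\frac{\Lambda(n)\chi(n)}{\sqrt{n}} - c_n\Bigr) n^{\frac12 - s},$$

where the notation $n \mid (|d|/q)^\infty$ means that $n$ is composed only of primes dividing $|d|/q$.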



Now, from (27) and the estimate $\frac{\Lambda(n)\chi(n)}{\sqrt{n}} - c_n = o(1)$ it follows that all zeros of $D(s)$ must
lie on the line $\Re(s) = \frac12$. Therefore, $D(s)$ takes the form

$$D(s) = \prod_{p \mid \frac{|d|}{q}} \prod_{j=1}^{n_p} \bigl(1 - \alpha_{p,j}\, p^{\frac12 - s}\bigr),$$

where $n_p \le \operatorname{ord}_p(|d|/q)$ and $|\alpha_{p,j}| = 1$ for each $p, j$. It is straightforward to show that this is
only compatible with (28) and the aforementioned estimate for the coefficients of the
right-hand side if $n_p = 0$ for all $p$, i.e. $D(s) = 1$ identically. Finally, invoking (27) once more, we
have $|d| = q$ and $\epsilon = 1$. □